Tag: web

I analyzed Facebook data to decide when to stream on Facebook Live. Here’s how.

Streaming on Facebook Live can be a powerful marketing strategy for startups and businesses to share knowledge, provide value, get exposure and collect high-quality leads. If you prepare your Facebook Live session upfront by researching your target audience and building a detailed agenda, the session can boost your business dramatically.

As chief of product and technology at my previous startup, which dealt with fraud detection, I decided to try Facebook Live as a new marketing strategy while it was still fairly new. Back then, once a new Facebook Live session went live, relevant people got Facebook notifications inviting them to join, which increased the exposure even more.

There are many posts about how to build a better Facebook Live session in terms of topics to cover, building an agenda, camera angles, session duration, etc. But there is one piece of the puzzle that business owners and marketers often forget or don't pay attention to: when is the best time to stream your Facebook Live session?

Facebook live

You can answer this question with an educated guess based on your familiarity with the target audience. For example:

  • Pregnant moms are ready to consume your Live session on Monday afternoon.
  • Teenagers aged 18-22 are in the right mindset on Saturday morning.

But nowadays, with so much data around us that is just a few clicks away, you are actually falling behind if you don't make proper use of the data available to you.

Almost every marketing platform or social network exposes API services that you, as a technological entrepreneur, can easily consume. When analyzed properly, this data can yield valuable conclusions that drive your business objectives way beyond your competitors'.

This approach is often called Data-driven decisions.

Once you start justifying any (or at least most) of your business decisions using data you own or data you can collect from different resources, you actually stop guessing and start making data-driven decisions.

I like to think of data-driven decisions as a form of crowd-sourcing. If you had a chance to watch this TED talk by Lior Zoref, where Lior invited an ox onto the stage and asked the audience to guess its weight, you were probably overwhelmed by how close the crowd's average was to the real weight of the ox: 1,792 lbs vs. 1,795 lbs!

Ox weight

When you make guesses about your business objectives as an individual, you're no different from any individual sitting in that crowd trying to estimate the ox's weight. You might even be the one who guessed 300 lbs or 8,000 lbs, which would probably cost your business a lot of unnecessary expenses.

But, if you’re using the wisdom of the crowd to make data-driven decisions, which you can do in almost any decision you make online, you’ll most likely be ahead of every other individual, or in business terms, ahead of your competitors.

Although I’m not a pure marketer, with basic data analysis skills, I can push my business forward dramatically in all aspects, including marketing.

In this post, I'm going to walk you through a practical, step-by-step guide to accessing Facebook data and analyzing it in order to determine the optimal time to broadcast on Facebook Live.

In order to follow this guide you need:

  • A Facebook account
  • A Facebook group you would like to analyze (if it’s a private one, you need to be a group member)
  • Python 2.7 installed
  • Jupyter notebook installed
  • Facebook graph API Python library installed

A Jupyter notebook is a highly recommended tool for data analysis in Python. It has a lot of strengths, but above all, it lets you run snippets of code and keep the results in memory, instead of re-running all of your scripts every time you make a minor change. This is crucial for data analysis, because some tasks take a long time to execute.

Although it’s not essential, I always recommend working inside a Python virtual environment. Here is a post I wrote about the advantages of a virtual environment when using Python.

Finally, I highly recommend working in an Ubuntu environment when doing data analysis with Jupyter notebooks.
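If any of these are missing, a minimal installation sketch from inside a fresh virtual environment might look like this (the package names are the common PyPI ones; the Graph API library for Python is usually installed as facebook-sdk):

$ pip install jupyter facebook-sdk requests numpy matplotlib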

Step 1: Getting the Facebook group ID

In order to get data from Facebook API, we need to specify the ID of the entity we want to get data from, in our case, a Facebook group.

Lookup-id.com is a nice tool you can use to easily find the ID of a group based on its URL. Copy the URL of your group and paste it in the search bar.

lookup-id

In this post, we will use the group: Web Design and Development.

ID: 319479604815804

Step 2: Getting to know the Graph API Explorer

In order to get the most out of Facebook API, besides documentation, Facebook has developed a playground for developers called the Graph API Explorer.

The Graph API Explorer enables us to easily get a temporary access token and start examining the capabilities that Facebook API has to offer.

Click on Get Token, and then on Get User Access Token. Don’t select any permission, just click Get Access Token.

Get access token

Facebook API has many endpoints you can use, but in this guide, we are going to use two main endpoints:

  • The group feed endpoint, which returns the posts published in a group
  • The reactions endpoint, which returns the reactions a specific post received

In order to figure out the structure of the response you’re expecting to get from a specific endpoint, you just need to specify the endpoint URL and click Submit.

Let’s examine the URL endpoint for grabbing the last posts from the group’s feed. Type this URL in the Graph API Explorer:

319479604815804/feed

and hit Submit.

feed_endpoint

You should now see the last posts from the group's feed in a JSON structure, containing each post's content, its ID and the time it was updated. By clicking one of the IDs and appending /reactions?summary=total_count to the URL, like this:

319479604815804_1468216989942054/reactions?summary=total_count

You should see a list of the reactions for the specific post, and a summary of the total count of reactions.

This way you can play around with all of the features the Facebook API has to offer.

Another wonderful tool for examining endpoints of APIs that don't offer a playground is Postman. You can read more about this tool, as well as other essential tools for web developers, here.

Step 3: Our plan and assumptions

Our goal is to find the optimal time interval for a Facebook Live session in the chosen group that contains our target audience. In order to do that, we assume that the more activity there is in the group at a specific time, the more likely our Facebook Live session is to gain traction.

So our goal is to figure out when the group's activity peaks over time, and by "when" I mean a specific weekday and hour.

In order to do that, we are going to grab the last 5000 posts from the group’s feed and plot the distribution of the times they were updated on.

We assume that longer posts indicate more activity in the group, because members spend more time in the group writing them. Therefore, our next step will be to take the length of each post into consideration in the distribution.

A reaction on Facebook is probably a great indication of people engaging with a specific post. Therefore, our last step will be to collect the total number of reactions for each post and take it into account in the distribution of activity over weekdays and hours.

Because reactions usually arrive some time after a post is published, we should be cautious when using this part of the analysis.

Step 4: Let’s analyze some data!

In order to start a Jupyter notebook, execute:

ipython notebook

and then choose New → Python 2.

new_notebook

In order to analyze and plot the data, we are going to use numpy and matplotlib libraries. These are very popular Python libraries you should use in order to better analyze your data.

Let’s import all the libraries we need:

import matplotlib.pyplot as plt
import numpy as np
import facebook
import urlparse
import datetime
import requests

and specify our access token and group id:

ACCESS_TOKEN = 'INSERT_ACCESS_TOKEN_HERE'
GROUP_ID = '319479604815804' # Web Design and Development group

Then, let’s initialize the API object with our access token:

graph = facebook.GraphAPI(ACCESS_TOKEN)

Now we want to grab the posts from the group’s feed. In order to avoid errors during the API calls, we will limit each API call to 50 posts and iterate over 100 API calls:

posts = []
base_url = "{}/feed?limit=50".format(GROUP_ID)
until = None
for i in xrange(100):
    # rebuild the URL on every iteration so we don't append multiple until parameters
    url = base_url if until is None else "{}&until={}".format(base_url, until)
    response = graph.request(url)
    data = response.get('data')
    if not data:
        break
    posts = posts + data
    # grab the until cursor from the paging URL in order to fetch older posts next time
    paging = response.get('paging')
    if not paging or not paging.get('next'):
        break
    parsed_url = urlparse.urlparse(paging.get('next'))
    until = urlparse.parse_qs(parsed_url.query)["until"][0]

In each API call, we specify the until parameter to get older posts.

Now, let’s organize the posts into weekdays and hours of the day:

weekdays = {i: 0 for i in xrange(7)}
hours_of_day = {i: 0 for i in xrange(24)}
hours_of_week = np.zeros((7,24), dtype=np.int)
for post in posts:
    updated = datetime.datetime.strptime(post.get("updated_time"), "%Y-%m-%dT%H:%M:%S+0000")
    weekday = updated.weekday()
    hour_of_day = updated.hour
    weekdays[weekday] += 1
    hours_of_day[hour_of_day] += 1
    hours_of_week[weekday][hour_of_day] += 1

and then, plot the results using matplotlib bar charts:

plt.bar(weekdays.keys(), weekdays.values(), color='g')
plt.show()

weekdays_1

(0 represents Monday)

plt.bar(hours_of_day.keys(), hours_of_day.values(), color='r')
plt.show()

hours_1

All times specified in IST.
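Note that the Graph API returns updated_time in UTC (hence the +0000 suffix we parse above). If you want the buckets to reflect your audience's local time, shift the timestamps before bucketing them. A minimal sketch, assuming an example offset of +3 hours (adjust it to your audience's timezone):

import datetime

UTC_OFFSET = datetime.timedelta(hours=3)  # example value, adjust to your audience

def to_local(updated_time_str):
    # parse a Graph API timestamp (UTC) and shift it by the chosen offset
    updated = datetime.datetime.strptime(updated_time_str, "%Y-%m-%dT%H:%M:%S+0000")
    return updated + UTC_OFFSET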

Even with this very basic analysis, we can already learn a lot about better and worse time slots for broadcasting to this group. But it's not informative enough yet, probably because the data is split across two graphs, so the interaction between weekday and hour is lost.

Let's present a heat map of the data instead, which lets us see all three dimensions (weekday, hour and amount of activity) at once:

plt.imshow(hours_of_week, cmap='hot')
plt.show()

heatmap_1

Well, this is much better! We can clearly see that the group is very active on Mondays to Fridays between 6 am and 10 am.
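To make the heat map even easier to read, you can label the axes with weekday names and hours. A small sketch that reuses the hours_of_week matrix built above (the zeros placeholder is only there so the snippet runs on its own):

import numpy as np
import matplotlib.pyplot as plt

hours_of_week = np.zeros((7, 24), dtype=np.int)  # replace with the matrix built above

weekday_names = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

plt.imshow(hours_of_week, cmap='hot', aspect='auto')
plt.colorbar(label='activity')
plt.yticks(range(7), weekday_names)
plt.xticks(range(0, 24, 2))
plt.xlabel('hour of day')
plt.ylabel('weekday')
plt.show()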

Now let's take the post length into consideration and see how it affects the results:

weekdays_content = {i: 0 for i in xrange(7)}
hours_of_day_content = {i: 0 for i in xrange(24)}
hours_of_week_content = np.zeros((7,24), dtype=np.int)
for post in posts:
    updated = datetime.datetime.strptime(post.get("updated_time"), "%Y-%m-%dT%H:%M:%S+0000")
    weekday = updated.weekday()
    hour_of_day = updated.hour
    content_length = len(post["message"]) if "message" in post else 1
    weekdays_content[weekday] += content_length
    hours_of_day_content[hour_of_day] += content_length
    hours_of_week_content[weekday][hour_of_day] += content_length

The heatmap we get:

heatmap2

This is nice, but it should be treated with caution. On one hand, we can clearly see a very specific point in time that looks like the optimal slot for our Facebook Live session. On the other hand, it might be skewed by an outlier, a single super long post.

I'll leave that for you to figure out in your next data analysis project. What I suggest is taking a larger number of posts, or grabbing an older batch of 5,000 posts from the group's feed.
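Another lightweight option, instead of (or in addition to) grabbing more posts, is to compress the content length before summing it, so that one extremely long post cannot dominate a single cell. A hedged sketch using a logarithm (np.log1p is just one possible damping choice):

import numpy as np

def dampened_length(post):
    # length of the post's message, log-compressed to limit the impact of outliers
    content_length = len(post["message"]) if "message" in post else 1
    return np.log1p(content_length)

If you plug this into the loop above, remember to build the accumulation matrix with a float dtype instead of np.int.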

In order to take reactions into account when analyzing the data, we need to make another API call for each post, because it’s a different API endpoint:

weekdays_reactions = {i: 0 for i in xrange(7)}
hours_of_day_reactions = {i: 0 for i in xrange(24)}
hours_of_week_reactions = np.zeros((7,24), dtype=np.int)
for i, post in enumerate(posts):
    url = "https://graph.facebook.com/v2.10/{id}/reactions?access_token={token}&summary=total_count".format(
    id=post["id"],
        token=ACCESS_TOKEN
    )

    headers = {
        "Host": "graph.facebook.com"
    }

    response = requests.get(url, headers=headers)

    try:
        total_reactions = 1 + response.json().get("summary").get("total_count")
    except:
        total_reactions = 1

    updated = datetime.datetime.strptime(post.get("updated_time"), "%Y-%m-%dT%H:%M:%S+0000")
    weekday = updated.weekday()
    hour_of_day = updated.hour
    weekdays_reactions[weekday] += total_reactions
    hours_of_day_reactions[hour_of_day] += total_reactions
    hours_of_week_reactions[weekday][hour_of_day] += total_reactions

The reason we used a low-level approach here, specifying the exact HTTP request instead of using the Facebook Python library, is that the library doesn't support the latest version of the Facebook API, which is required when querying the Reactions endpoint.

The heat map generated from this data:

heatmap_3

We can conclude that the three approaches we used agree on Monday and Wednesday, 6-7 am.
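If you prefer a single answer over eyeballing three heat maps, one option is to normalize each 7x24 matrix to the 0-1 range, average them, and pick the cell with the highest combined score. A minimal sketch, assuming the three matrices built in the previous steps are still in memory:

import numpy as np

def normalize(matrix):
    # scale a matrix to the 0-1 range so the three signals are comparable
    matrix = matrix.astype(float)
    return matrix / matrix.max() if matrix.max() > 0 else matrix

combined = (normalize(hours_of_week) +
            normalize(hours_of_week_content) +
            normalize(hours_of_week_reactions)) / 3.0

# the (weekday, hour) cell with the highest combined score; 0 represents Monday
best_weekday, best_hour = np.unravel_index(combined.argmax(), combined.shape)
print best_weekday, best_hour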

Conclusions

Data analysis can be challenging and often requires creativity. But it is also exciting and very rewarding.

After choosing our time to broadcast on Facebook Live based on the analysis presented here, we had a huge success and a lot of traction during our Live session.

I encourage you to try and use data analysis to make data-driven decisions in your next business move. And on top of that, start thinking in terms of data-driven decisions.

You can find the GitHub repository here.

 

Deploy Django app: Nginx, Gunicorn, PostgreSQL & Supervisor

Django has been the most popular Python-based web framework for a while now. Django is powerful, robust, full of capabilities and surrounded by a supportive community. Django is based on models, views and templates, similarly to other MVC frameworks out there.

Django provides you with a development server out of the box once you start a new project using the commands:

$ django-admin startproject my_project 
$ python ./manage.py runserver 8000

With two lines in the terminal, you have a working development server on your local machine and can start coding. One of the tricky parts of Django is deploying the project so it is available from different devices around the globe. As technological entrepreneurs, we need to not only develop apps with a backend and a frontend, but also deploy them to a production environment that is modular, maintainable and, of course, secure.

django dev server

Deploying a Django app requires several different mechanisms, which we will go through step by step. Before we begin, let's align on the tools we are going to use throughout this post:

  1. Python version 2.7.6
  2. Django version 1.11
  3. Linux Ubuntu server hosted on DigitalOcean cloud provider
  4. Linux Ubuntu local machine
  5. Git repository containing your codebase

I assume you are already using 1, 2, 4 and 5. As for the Linux server, we will create it together during the first step of the deployment tutorial. Please note that this post discusses deployment on a single Ubuntu server. This configuration is great for small projects, but in order to scale your resources up to support larger amounts of traffic, you should consider a high-availability server infrastructure, using load balancers, floating IP addresses, redundancy and more.

Linux is much more popular for serving web apps than Windows. Additionally, Python and Django work together very well with Linux, and not so well with Windows.

There are many reasons for choosing DigitalOcean as a cloud provider, especially for small projects that will be deployed on a single droplet (a virtual server in DigitalOcean terminology). DigitalOcean is a great solution for software projects and startups which start small and scale up step by step. Read more about my comparison between DigitalOcean and Amazon Web Services in terms of an early-stage startup software project.

There are some best practices for setting up your Django project I highly recommend you to follow before starting the deployment process. The best practices include working with a virtual environment, exporting requirements.txt file and configuring the settings.py file for working with multiple environments.

django best practices

This post will cover the deployment process of a Django project from A to Z on a brand-new Linux Ubuntu server. Feel free to choose your favorite cloud provider other than DigitalOcean for deployment.

As mentioned, Django's built-in development server is weak and not built for scale. You can use it while developing your Django project yourself or sharing it with your co-workers, but not much more than that. In order to serve your app in a production environment, we need several components that talk to each other and make the magic happen. Hosting a web application usually requires the orchestration of three actors:

  1. Web server
  2. Gateway
  3. Application

The web server

The web server receives an HTTP request from the client (the browser) and is usually responsible for load balancing, proxying requests to other processes, serving static files, caching and more. The web server usually interprets the request and sends it on to the gateway. Common web servers are Apache and Nginx. In this tutorial, we will use Nginx (which is also my favorite).

The Gateway

The gateway translates the request received from the web server so the application can handle it. The gateway is often responsible for logging and reporting as well. We will use Gunicorn as our Gateway for this tutorial.

The Application

As you may have already guessed, the application refers to your Django app. The app takes the interpreted request, processes it using the logic you implemented as a developer, and returns a response.
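To make the relationship concrete, the "application" that the gateway loads is simply the WSGI callable Django generates in your project's wsgi.py. A sketch of what that module typically looks like (app stands in for your project name):

import os

from django.core.wsgi import get_wsgi_application

# point Django at the settings module before building the WSGI callable
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "app.settings")

# the gateway (Gunicorn in our case) imports this module and calls `application` for every request
application = get_wsgi_application()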


Assuming you have an existing ready-for-deployment Django project, we are going to deploy your project by following these steps:

  1. Creating a new DigitalOcean droplet
  2. Installing prerequisites: pip, virtual environment, and git
  3. Pulling the Django app from Git
  4. Setting up PostgreSQL
  5. Configuring Gunicorn with Supervisor
  6. Configuring Nginx for listening to requests
  7. Securing your deployed app: setting up a firewall

Creating a droplet

A droplet in DigitalOcean is a virtual Linux server with CPU, RAM and disk space. The first step in this tutorial is creating a new droplet and connecting to it via SSH. Assuming your local machine is running Ubuntu, we are going to create a new SSH key pair in order to easily and securely connect to our droplet once it is created. Connecting with SSH keys (rather than a password) is both simpler and more secure. If you already have an SSH key pair, you can skip the creation process. On your local machine, enter in the terminal:

$ ssh-keygen -t rsa

You will be asked two more questions: where to store the keys (the default is fine) and whether you want to set a passphrase (not essential).

Now the key pair is located in:

/home/user/.ssh/

where id_rsa.pub is your public key and id_rsa is your private key. In order to use the key pair to connect to a remote server, the public key should be located on the remote server and the private key should be located on your local machine.

Notice that the public key can be placed on every remote server you wish to connect to, but the private key must be kept only on your local machine! Sharing the private key would enable other users to connect to your server.

After signing up with DigitalOcean, open the SSH page and click on the Add SSH Key button. In your terminal copy the newly-created public key:

$ cat /home/user/.ssh/id_rsa.pub

Enter the new public key you generated and name it as you wish.

SSH key

Now that the key is stored in your account, you can assign it to every droplet you create. The droplet will contain the key so you can connect to it from your local machine, while password authentication is disabled by default, which is highly recommended.

Now we are ready to create our droplet. Click on “Create Droplet” at the top bar of your DigitalOcean dashboard.

create droplet

Choose Ubuntu 16.04 64-bit as your image, a droplet size of either 512MB or 1GB of RAM, and whatever region makes sense to you.

 

image distro

droplet size

droplet region

You can select the private networking feature (which is not essential for this tutorial). Make sure to select the SSH key you’ve just added to your account. Name your new droplet and click “Create”.

private networking

select ssh keys

create droplet

Once your new droplet has been created, you should be able to connect to it easily using the SSH key you created. In order to do that, copy the IP address of your droplet from your droplets page inside your dashboard, go to your local terminal and type:

$ ssh root@IP_ADDRESS_COPIED

Make sure to replace IP_ADDRESS_COPIED with your droplet's IP address. You should be connected by now.

Tip for advanced users: in case you want to configure an even simpler way to connect, add an alias to your droplet by editing the file:

$ nano /home/user/.ssh/config

and adding:

Host remote-server-name 
    Hostname DROPLET_IP_ADDRESS 
    User root

Make sure to replace remote-server-name with a name of your choice, and DROPLET_IP_ADDRESS with the IP address of the server.

Save the file by hitting Ctrl+O and then close it with Ctrl+X. Now all you need to do in order to connect to your droplet is typing:

$ ssh remote-server-name

That simple.

Installing prerequisites

Once connected to your droplet, we are going to install some software to start our deployment process. Start by updating your package repositories and installing the required packages:

$ sudo apt-get update
$ sudo apt-get install python-pip python-dev build-essential libpq-dev postgresql postgresql-contrib nginx git virtualenv virtualenvwrapper
$ export LC_ALL="en_US.UTF-8"
$ pip install --upgrade pip
$ pip install --upgrade virtualenv

Hopefully, you work with a virtual environment on your local machine. If you don't, I highly recommend reading my best practices post for setting up a Django project, to see why working with virtual environments is an essential part of your Django development process.

Let’s get to configuring the virtual environment. Create a new folder with:

$ mkdir ~/.virtualenvs 
$ export WORKON_HOME=~/.virtualenvs

Configure the virtual environment wrapper for easier access by running:

$ nano ~/.bashrc

and adding this line to the end of the file:

. /usr/local/bin/virtualenvwrapper.sh

Tip: use Ctrl+V to scroll down faster, and Ctrl+Y to scroll up faster inside the nano editor.

Hit Ctrl+O to save the file and Ctrl+X to close it. In your terminal type:

$ . .bashrc

Now you should be able to create your new virtual environment for your Django project:

$ mkvirtualenv virtual-env-name

From within your virtual environment install:

(virtual-env-name) $ pip install django gunicorn psycopg2

Tip: Useful commands for working with your virtual environment:

$ workon virtual-env-name # activate the virtual environment 
$ deactivate # deactivate the virtual environment

Pulling application from Git

Start by creating a new user that will hold your Django application:

$ adduser django 
$ cd /home/django 
$ git clone REPOSITORY_URL

Assuming your code base is already located in a Git repository, just type your password and the repository will be cloned onto your remote server. You might need to add execute permissions to manage.py by navigating into your project folder (the one you've just cloned) and typing:

$ chmod 755 ./manage.py

In order to take the virtual environment one step further in terms of simplicity, copy the path of your project’s main folder to the virtual environment settings by typing:

$ pwd > /root/.virtualenvs/virtual-env-name/.project

Make sure to replace virtual-env-name with the real name of your virtual environment. Now, once you use the workon command to activate your virtual environment, you’ll be navigated automatically to your project’s main path.

In order to set up the environment variable properly, type:

$ nano /root/.virtualenvs/virtual-env-name/bin/postactivate # replace virtual-env-name with the real name

and add this line to the file:

export DJANGO_SETTINGS_MODULE=app.settings

Make sure to replace app.settings with the location of your settings module inside your Django app. Save and close the file.

Assuming you’ve set up your requirements.txt file as described in the Django best practices post, you’re now able to install all your requirements at once by navigating to the path of the requirements.txt file and run from within your virtual environment:

(virtual-env-name) $ pip install -r requirements.txt

Setting up PostgreSQL

Assuming you've set up your settings module as described in the Django best practices post, you should by now have a separation between the development and production settings files. Your production.py settings file should contain the PostgreSQL connection settings as well. If it doesn't, add to the file:

DATABASES = { 
    'default': { 
        'ENGINE': 'django.db.backends.postgresql', 
        'NAME': 'app_db', 
        'USER': 'app_user', 
        'PASSWORD': 'password', 
        'HOST': 'localhost', 
        'PORT': '5432', 
    } 
}

I highly recommend updating and pushing the file on your local machine and pulling it from the remote server using the repository we cloned.

Let’s get to creating the production database. Inside the terminal, type:

$ sudo -u postgres psql

Now you should be inside PostgreSQL terminal. Create your DB and user with:

> CREATE DATABASE app_db; 
> CREATE USER app_user WITH PASSWORD 'password'; 
> ALTER ROLE app_user SET client_encoding TO 'utf8'; 
> ALTER ROLE app_user SET default_transaction_isolation TO 'read committed'; 
> ALTER ROLE app_user SET timezone TO 'UTC'; 
> ALTER USER app_user CREATEDB; 
> GRANT ALL PRIVILEGES ON DATABASE app_db TO app_user;

Make sure your details here match the production.py settings file DB configuration as described above. Exit the PostgreSQL shell by typing \q.

Now you should be ready to run the migration commands on the new DB. Assuming all of your migrations folders are listed in the .gitignore file, meaning they are not pushed to the repository, your migrations folders on the server should be empty. Therefore, you can set up the DB by navigating to your main project path with:

(virtual-env-name) $ cdproject

and then run:

(virtual-env-name) $ python ./manage.py migrate
(virtual-env-name) $ python ./manage.py makemigrations
(virtual-env-name) $ python ./manage.py migrate

Don’t forget to create yourself a superuser by typing:

(virtual-env-name) $ python ./manage.py createsuperuser

Configuring Gunicorn with Supervisor

Now that the application is set up properly, it's time to configure our gateway for sending requests to the Django application. We will use Gunicorn, a commonly used gateway.

Start by navigating to your project’s main path by typing:

(virtual-env-name) $ cdproject

First, we will test gunicorn by typing:

(virtual-env-name) $ gunicorn --bind 0.0.0.0:8000 app.wsgi:application

Make sure to replace app with your app’s name. Once gunicorn is running your application, you should be able to access http://IP_ADDRESS:8000 and see your application in action.

When you’re finished testing, hit Ctrl+C to stop gunicorn from running.

Now it's time to run gunicorn as a service to make sure it keeps running continuously. Rather than setting up a systemd service, we will use a more robust approach with Supervisor. Supervisor, as the name suggests, is a great tool for monitoring and controlling processes, and it helps you better understand how your processes behave.

To install supervisor, type outside of your virtual environment:

$ sudo apt-get install supervisor

Once supervisor is running, every .conf file that is included in the path:

/etc/supervisor/conf.d

represents a monitored process. Let’s add a new .conf file to monitor gunicorn:

$ nano /etc/supervisor/conf.d/gunicorn.conf

and add into the file:

[program:gunicorn]
directory=/home/django/app-django/app
command=/root/.virtualenvs/virtual-env-name/bin/gunicorn --workers 3 --bind unix:/home/django/app-django/app/app.sock app.wsgi:application
autostart=true
autorestart=true
stderr_logfile=/var/log/gunicorn/gunicorn.err.log
stdout_logfile=/var/log/gunicorn/gunicorn.out.log
user=root
group=www-data
environment=LANG=en_US.UTF-8,LC_ALL=en_US.UTF-8

[group:guni]
programs=gunicorn

Make sure that all the references are properly configured, and that the log directory referenced above (/var/log/gunicorn in this example) actually exists. Save and close the file.

Now let’s update supervisor to monitor the gunicorn process we’ve just created by running:

$ supervisorctl reread 
$ supervisorctl update

In order to validate the process integrity, use this command:

$ supervisorctl status

By now, gunicorn runs as an internal process rather than a process that can be accessed by users outside the machine. In order to start sending traffic to gunicorn, and from there to your Django application, we will set up Nginx to serve as our web server.

Configuring Nginx

Nginx is one of the most popular web servers out there. The integration between Nginx and Gunicorn is seamless. In this section, we’re going to set up Nginx to send traffic to Gunicorn. In order to do that, we will create a new configuration file (make sure to replace app with your own app name):

$ nano /etc/nginx/sites-available/app

then edit the file by adding:

server { 
    listen 80; 
    server_name SERVER_DOMAIN_OR_IP; 
    location = /favicon.ico { access_log off; log_not_found off; } 
    location /static/ { 
        root /home/django/app-django/app; 
    } 
    location / { 
        include proxy_params; 
        proxy_pass http://unix:/home/django/app-django/app/app.sock; 
    } 
}

This configuration will proxy requests to the appropriate route in your server. Make sure to set all the references properly according to Gunicorn and to your app configurations.
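One thing to watch: the /static/ location above only serves files that physically exist under that root. If your static files live inside individual apps, you will typically need to define a STATIC_ROOT in your production settings and collect the files there. A hedged sketch, with paths that simply match the layout used in this post:

# in settings/production.py (example path)
STATIC_URL = '/static/'
STATIC_ROOT = '/home/django/app-django/app/static/'

and then run from your project folder:

(virtual-env-name) $ python ./manage.py collectstatic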

Enable the configuration by creating a symbolic link:

$ ln -s /etc/nginx/sites-available/app /etc/nginx/sites-enabled

Check Nginx configuration by running:

$ nginx -t

Assuming all is good, restart Nginx by running:

$ systemctl restart nginx

By now you should be able to access your server just by typing its IP address in the browser, because Nginx listens on port 80, which is the default port browsers use.

Security

Well done! You should have a deployed Django app by now! Now it's time to secure the app to make it much more difficult to hack. To do that, we will use ufw, Ubuntu's built-in firewall tool.

ufw works by configuring rules. Rules tell the firewall which kind of traffic it should accept or decline. At this point, there are two kinds of traffic we want to accept, or in other words, two ports we want to open:

  1. port 80 for listening to incoming traffic via browsers
  2. port 22 to be able to connect to the server via SSH.

Open these ports by typing:

$ ufw allow 80 
$ ufw allow 22

then enable ufw by typing:

$ ufw enable

Tip: before closing the terminal, make sure you are able to connect via SSH from another terminal, so you're not locked out of your droplet due to a bad firewall configuration.

What to do next?

This post is the ultimate guide to deploying a Django app on a single server. In case you're developing an app that should serve larger amounts of traffic, I suggest looking into a highly scalable server architecture. You can start with my post about how to design a high-availability server architecture.

3 best practices for better setting up your Django project

Django is a robust, open source, Python-based framework for building web applications. Its popularity has increased over the last couple of years, and it is already mature and widely used, with a large community behind it. Among the other Python-based frameworks for creating web applications (like Flask and Pyramid), Django is by far the most popular. It supports both Python 2.7 and Python 3.6, but as of the time of writing, Python 2.7 is still more accessible in terms of community, 3rd party packages and online documentation. Django is secure when used properly and provides a lot of flexibility, so it is the way to go when developing server-side applications in Python.

In this article, I will share with you best practices of a Django setup I’ve learned and collected over the recent years. Whether you have a few Django projects under your belt, or you’re just about to start your first Django project from scratch, the collection described here might help you create better applications down the road. The article has been written from a very practical mindset so you can add some tools to your development toolbox immediately, or even create yourself an advanced custom Django boilerplate for your next projects.

* In this article I assume you’re using a Linux Ubuntu machine.

Virtual Environment

While developing Python-based applications, using 3rd party packages is an ongoing thing. These packages are typically updated often, so keeping them organized is essential. When developing more and more projects on the same local machine, it's challenging to keep track of the current version of each package, and impossible to use different versions of the same package for different projects. Moreover, updating a package for one project might break functionality in another, and vice versa. That's where the Python virtual environment comes in handy. To install virtualenv, use:

$ apt-get update
$ apt-get install python-pip python-dev build-essential

$ export LC_ALL="en_US.UTF-8" # might be necessary in case you get an error from the next line

$ pip install --upgrade pip
$ pip install --upgrade virtualenv
$ mkdir ~/.virtualenvs
$ pip install virtualenvwrapper
$ export WORKON_HOME=~/.virtualenvs
$ nano ~/.bashrc

add this line to the end of the file:

. /usr/local/bin/virtualenvwrapper.sh

then execute:

$ . .bashrc

After installing, create a new virtual environment for your project by typing:

$ mkvirtualenv project_name

While you're in the context of your virtual environment, you'll notice a prefix added to your terminal prompt, like:

(project_name) ofir@playground:~$

In order to deactivate (exit) the virtual environment and get back to the main Python context of your local machine, use:

$ deactivate

In order to activate (start) the virtual environment context, use:

$ workon project_name

To list the virtual environments that exist on your local machine, use:

$ lsvirtualenv

Holding your project dependencies (packages) in a virtual environment on your machine allows you to keep them in an isolated environment and only use them for a single (or multiple) projects. When creating a new virtual environment you’re starting a fresh environment with no packages installed in it. Then you can use, for example:

(project_name) $ pip install Django

for installing Django in your virtual environment, or:

(project_name) $ pip install Django==1.11

for installing version 1.11 of Django accessible only from within the environment.

Neither your main Python interpreter nor the other virtual environments on your machine will be able to access the new Django package you’ve just installed.

In order to use the runserver command using your virtual environment, while in the context of the virtual environment, use:

(project_name) $ cd /path/to/django/project
(project_name) $ ./manage.py runserver

Likewise, when entering the Python interpreter from within the virtual environment by typing:

(project_name) $ python

it will have access to packages you’ve already installed inside the environment.
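A quick way to verify which Django the environment actually sees (django.get_version() is part of Django's public API):

(project_name) $ python -c "import django; print(django.get_version())"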

Requirements

Requirements are the list of Python packages (dependencies) your project uses while running, including the version of each package. Here's an example requirements.txt file:

dicttoxml==1.7.4
Django==1.11.2
h5py==2.7.0
matplotlib==2.0.2
numpy==1.13.0
Pillow==4.1.1
psycopg2==2.7.1
pyparsing==2.2.0
python-dateutil==2.6.0
pytz==2017.2
six==1.10.0
xmltodict==0.11.0

Keeping your requirements.txt file up to date is essential for collaborating properly with other developers, as well as keeping your production environment properly configured. This file, when included in your code repository, enables you to update all the packages installed in your virtual environment by executing a single line in the terminal, and by that to get new developers up and running in no time. In order to generate a new requirements.txt or to update an existing one, use from within your virtual environment:

(project_name) $ pip freeze > requirements.txt

For your convenience, make sure to execute this command in a folder that is being tracked by your Git repository so other instances of the code will have access to the requirements.txt file as well.

Once a new developer is joining the team, or you want to configure a new environment using the same packages listed in the requirements.txt file, execute in the virtual environment context:

(project_name) $ cd /path/to/requirements/file
(project_name) $ pip install -r requirements.txt

All requirements listed in the file will immediately be installed in your virtual environment. Older versions will be updated and newer versions will be downgraded to fit the exact list of requirements.txt. Be careful though, because there might be differences sometimes between different environments that you still want to respect.

I highly recommend integrating these commands into your workflow: update the requirements.txt file before pushing code to the repository, and install from requirements.txt after pulling code from the repository.
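One lightweight way to bake this into your workflow is a pair of shell aliases added to your ~/.bashrc (the alias names here are arbitrary, pick whatever you like):

alias req-freeze='pip freeze > requirements.txt'     # run before pushing
alias req-install='pip install -r requirements.txt'  # run after pulling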

Better settings.py Configuration

Django comes out of the box with a very basic yet useful settings.py file that defines the main and most useful configurations for your project. The settings.py file is very straightforward, but sometimes, as a developer working on a team, or when setting up a production environment, you need more than one basic settings.py file.

Multiple settings files allow you to easily define tailor-made configurations for each environment separately like:

ALLOWED_HOSTS # for production environment
DEBUG
DATABASES # for different developers on the same team

Let me introduce an extended approach for configuring your settings, which allows you to easily maintain different versions and use the one you want at any given time, in any environment.

First, navigate to your settings.py file path:

(project_name) $ cd /path/to/settings/file

Then create a new module called settings (module is a folder containing an __init__.py file):

(project_name) $ mkdir settings

Now, rename your settings.py file to base.py and place it inside the new module you created:

(project_name) $ mv settings.py settings/base.py

For this example, I assume that you want to configure one settings file for your development environment and one for your production environment. You can use the exact same approach for defining different settings files for different developers in the same team.

For your development environment create:

(project_name) $ nano settings/development.py

Then type:

from .base import *

DEBUG = True

and save the file by hitting Ctrl + O, Enter and then Ctrl + X.

For your production environment create:

(project_name) $ nano settings/production.py

and type:

from .base import *

DEBUG = False
ALLOWED_HOSTS = ['app.project_name.com', ]

Now, whenever you want to add or update the settings of a specific environment, you can easily do it in its own settings file. The last question is: how does Django know which settings file to load in each environment? That's what the __init__.py file is for. When Django looks for the settings.py it used to load when running the server, it now finds a settings module rather than a settings.py file. But as long as it's a module containing an __init__.py file, as far as Django is concerned, it's the exact same thing. Django will load the __init__.py file and execute whatever is written in it. Therefore, we need to define which settings file we want to load inside the __init__.py file, by editing it:

(project_name) $ nano settings/__init__.py

and then, for a production environment, for example, typing:

from .production import *

This way, Django will load all the base.py and production.py settings every time it starts. Magic?

Now, the only configuration left is to keep the __init__.py in your .gitignore file so it is not included in pushes and pulls. Once you set up a new environment, don't forget to create a new __init__.py file inside the settings module and import the required settings file exactly as we did before.
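If keeping __init__.py out of version control feels fragile, an optional variation (not part of the workflow described above) is to commit an __init__.py that picks the settings file based on an environment variable; DJANGO_SETTINGS_ENV below is a hypothetical variable name:

# settings/__init__.py
import os

_env = os.environ.get("DJANGO_SETTINGS_ENV", "development")  # hypothetical variable

if _env == "production":
    from .production import *
else:
    from .development import *

This way the same file works in every environment, and the default keeps local development friction-free.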

In this article we’ve covered three best practices for better setting up your Django project:

  • Working inside a virtual environment
  • Keeping requirements.txt file up to date and use it continuously in your work flow
  • Setting up a better project settings structure

This is part 1 in a series about best practices for Django development. Follow me to get an immediate update once the next parts are available.

Have you followed these best practices in your last project? Do you have any insights to share? Comments are highly appreciated.