Jun 25, 2015

Python Spawn New Process SubProcess don't wait

Spawn New Process:
You can spawn a new process from within another process.
The subprocess module is used to start the new process:

from subprocess import Popen
proc = Popen(["python", "process_doc.py"], stdin=None, stdout=None, stderr=None, close_fds=True)

close_fds=True closes the parent's open file descriptors (other than stdin, stdout and stderr) in the child, so the child cannot use them.

Once execution reaches Popen, the child starts as a background (daemon-like) process and the parent continues immediately; the child keeps running in parallel.

Example:
.....
from subprocess import Popen

print("Before the child process starts")
# Pass the command as a list of arguments; no shell is needed here
proc = Popen(["python", "process_doc.py"], stdin=None, stdout=None, stderr=None, close_fds=True)
print("After the child process starts")

.....

In the above example, the parent process does not wait for the child process (process_doc.py) to finish. It spawns the new process and continues.
The child process runs separately, finishes its task and exits.
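
The behaviour described above can be checked with a small sketch, assuming Python 3. The inline one-liner in CHILD_CODE is a hypothetical stand-in for process_doc.py, and sys.executable is used so the child runs under the same interpreter as the parent.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for process_doc.py: sleeps briefly, then exits.
CHILD_CODE = "import time; time.sleep(1)"


def spawn_child():
    """Start the child and return at once, without waiting for it."""
    return subprocess.Popen([sys.executable, "-c", CHILD_CODE], close_fds=True)


if __name__ == "__main__":
    print("Before the child process starts")
    started = time.time()
    proc = spawn_child()
    print("After the child process starts (%.2fs elapsed)" % (time.time() - started))
    # poll() returns None for as long as the child is still running.
    print("Child still running:", proc.poll() is None)
    proc.wait()  # only so the demo exits cleanly; a fire-and-forget caller skips this
```

The second print appears well before the child's one-second sleep is over, which is the "don't wait" behaviour this post is about.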

Run command in Shell:
import subprocess

cmd = 'python /tmp/test.py'
subprocess.call(cmd, shell=True)  # Synchronous: blocks until the command finishes
subprocess.Popen(cmd, shell=True, executable='/bin/bash')  # Asynchronous: returns immediately
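
The difference in return values may be easier to see in a short sketch, assuming Python 3. CMD is a hypothetical command standing in for 'python /tmp/test.py'; it is given as an argument list, so no shell is involved.

```python
import subprocess
import sys

# Hypothetical command: a child that simply exits with code 3.
CMD = [sys.executable, "-c", "import sys; sys.exit(3)"]


def run_sync():
    """subprocess.call blocks until the command exits and returns its exit code."""
    return subprocess.call(CMD)


def run_async():
    """subprocess.Popen returns a handle at once; the exit code is collected later."""
    proc = subprocess.Popen(CMD)
    # ... the parent is free to do other work here ...
    return proc.wait()  # block only at the point where the result is needed


if __name__ == "__main__":
    print("call returned:", run_sync())
    print("Popen + wait returned:", run_async())
```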

Python Asynchronous


Real Time Scenario:

In general, data is processed synchronously by default.
To improve performance, data processing that nothing else depends on can be moved out of the main flow by spawning a new process, which can itself use multi-threading or multi-processing.

E.g.,
Suppose you have a web page where, on submit, many images/documents have to be uploaded to a remote location. Done synchronously, one by one, this takes a long time. When hundreds of requests keep coming in, it is difficult for the server to handle.

How can we achieve better results the asynchronous way:
Select the images/documents to upload and submit
Spawn a new process to upload the images (this processing can be multi-threaded or multi-process)
Keep updating the progress in the database (1 of 5 completed, 2 of 5 completed, etc.)
The browser keeps polling the server for the status stored in the database
Once the status is SUCCESS, notify the user with a success message
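
The progress-tracking part of these steps could look roughly like the sketch below, assuming Python 3. Everything named here is hypothetical: sqlite3 stands in for whatever database the application uses, upload_status is an invented table, and the upload argument is a placeholder for the real remote upload. The final status string is "Success" because that is what the browser-side check in this post compares against.

```python
import sqlite3
from contextlib import closing

DONE = "Success"         # the value the browser-side check waits for
RUNNING = "In progress"  # hypothetical label for an unfinished job


def init_db(db_path):
    """Create the (hypothetical) upload_status table."""
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS upload_status ("
            "job_id TEXT PRIMARY KEY, done INTEGER, total INTEGER, status TEXT)"
        )
        conn.commit()


def process_uploads(db_path, job_id, files, upload=lambda path: None):
    """Worker side: upload each file and record progress after every one."""
    total = len(files)
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute(
            "INSERT OR REPLACE INTO upload_status VALUES (?, 0, ?, ?)",
            (job_id, total, DONE if total == 0 else RUNNING),
        )
        conn.commit()
        for done, path in enumerate(files, start=1):
            upload(path)  # placeholder for the real remote upload
            conn.execute(
                "UPDATE upload_status SET done = ?, status = ? WHERE job_id = ?",
                (done, DONE if done == total else RUNNING, job_id),
            )
            conn.commit()  # commit per file so pollers see "1 of 5", "2 of 5", ...


def get_status(db_path, job_id):
    """Polling side: the data a status endpoint would return to the browser."""
    with closing(sqlite3.connect(db_path)) as conn:
        row = conn.execute(
            "SELECT done, total, status FROM upload_status WHERE job_id = ?",
            (job_id,),
        ).fetchone()
    if row is None:
        return {"status": "Unknown"}
    return {"done": row[0], "total": row[1], "status": row[2]}
```

In a real setup process_uploads would run in the spawned process, while the web request only starts it and returns; get_status is what the polling request would call.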

Spawn New Process:
You can spawn a new process from within another process.
The subprocess module is used to start the new process:

from subprocess import Popen
proc = Popen(["python", "process_doc.py"], stdin=None, stdout=None, stderr=None, close_fds=True)

close_fds=True closes the parent's open file descriptors (other than stdin, stdout and stderr) in the child, so the child cannot use them.

Once execution reaches Popen, the child starts as a background (daemon-like) process and the parent continues immediately; the child keeps running in parallel.

JQuery/JavaScript to keep checking Server for status:
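var POLL_INTERVAL_MS = 3000;  // assumed polling interval; the original value is not given

function runner() {
    setTimeout(function() {
        $.ajax({
            url: "AJAX_POST_URL",
            type: "POST",
            data: formData,  // request payload, defined elsewhere on the page
            success: function(data, textStatus) {
                if (data.status != 'Success') {
                    runner();  // not finished yet: poll again
                } else {
                    alert("Successfully Uploaded");
                }
            }
        });
    }, POLL_INTERVAL_MS);
}

runner();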
 

Jun 24, 2015

Python/Django Html to Text Conversion

Html to Text Conversion in Python (Using BeautifulSoup)

from urllib.request import urlopen  # Python 3; on Python 2 use urllib.urlopen
from bs4 import BeautifulSoup

url = "<url>"
html = urlopen(url).read()
soup = BeautifulSoup(html, "html.parser")

# kill all script and style elements
for script in soup(["script", "style"]):
    script.extract()    # rip it out

# get text
text = soup.get_text()

print(text)

Html to Text Conversion in Django (strip_tags)

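from urllib.request import urlopen  # Python 3; on Python 2 use urllib.urlopen
from django.utils.html import strip_tags

url = "<url>"

# strip_tags expects text, so decode the response bytes (UTF-8 is assumed here)
html = urlopen(url).read().decode("utf-8")
text = strip_tags(html)

print(text)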
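
If neither BeautifulSoup nor Django is available, the same idea can be sketched with only the standard library, assuming Python 3. This is a swapped-in technique, not what either library does internally: TextExtractor and html_to_text are hypothetical names, and the class mirrors the BeautifulSoup example by skipping everything inside script and style elements.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping anything inside <script> or <style>."""

    SKIPPED = ("script", "style")

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def html_to_text(html):
    """Return the visible text of an HTML string, joined with single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Usage: html_to_text(html) with the decoded page source from either example above.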


Jun 14, 2015

Django access remote application from your system

Assume you have two systems with Django applications installed

Suppose
System1 IP is : 192.10.10.111
System2 IP is : 192.10.10.222 

Suppose you are using System1, and you want to also access the System2 application (for Testing)
Instead of going to System2, you can very well configure host settings to access System2 machine from your own Desktop/Machine

Say the application name (ServerName) for System1 is app111.com
Say the application name (ServerName) for System2 is app222.com

Configure Hosts File in Windows System (192.10.10.111):

The hosts file in Windows is located at:
C:\Windows\System32\drivers\etc\hosts

Suppose your system IP is: 192.10.10.111

Add the following to the Hosts File:
192.10.10.111    app111.com
192.10.10.222   app222.com

When you access app222.com in your browser, the name now resolves to 192.10.10.222, so the request goes to the System2 application.

Similarly you can make use of hosts file to access the remote applications at your ease.
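
To double-check which IP each name maps to, the hosts file can be read back with a few lines of Python. This is a minimal sketch assuming Python 3; parse_hosts is a hypothetical helper, not part of any library.

```python
def parse_hosts(text):
    """Map each hostname found in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Keep the first entry seen for a name and ignore later duplicates.
            mapping.setdefault(name.lower(), ip)
    return mapping


if __name__ == "__main__":
    sample = "192.10.10.111    app111.com\n192.10.10.222   app222.com\n"
    for name, ip in parse_hosts(sample).items():
        print(name, "->", ip)
```

To check the real file, pass in the contents of C:\Windows\System32\drivers\etc\hosts instead of the sample string.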

You might also be interested in reading:
Django Application with Remote Database


Django Application with remote database

Assume you want to access remote database on your Django application

Suppose
Your system IP is          : 192.10.10.111  (or) localhost
Your friend's system IP is : 192.10.10.222

Run your Django application with local database:
DATABASES = {
    'default': {
        'HOST': '192.10.10.111',    # (or) localhost
        'ENGINE': 'django.db.backends.mysql',
        'NAME': '111_test_db',
        'USER': '111_user',
        'PASSWORD': 'XXX'
    }
}

Run your Django application with remote database:
DATABASES = {
    'default': {
        'HOST': '192.10.10.222',
        'ENGINE': 'django.db.backends.mysql',
        'NAME': '222_test_db',
        'USER': '222_user',
        'PASSWORD': 'YYY'
    }
}

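When your Django application uses a remote database, the remote database server must grant permissions to the user connecting from your host.

E.g., log in to MySQL on the remote server that holds '222_test_db' and run:
grant all privileges on *.* to '111_user'@'192.10.10.111' identified by 'XXX' with grant option;

This grants 111_user, connecting from 192.10.10.111, access to 222_test_db (and, because of *.*, to every other database on that server). The user name and password in the grant must match USER and PASSWORD in the Django settings. Without the grant, MySQL rejects the connection with an "Access denied" error.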
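
Rather than editing settings.py each time, the choice between the two databases could be driven by an environment variable. This is only a sketch assuming Python 3: DJANGO_DB_TARGET and build_databases are hypothetical names, and the values are copied from the two settings blocks above.

```python
import os

# Values copied from the two DATABASES blocks in this post.
LOCAL_DB = {
    "HOST": "192.10.10.111",  # (or) localhost
    "ENGINE": "django.db.backends.mysql",
    "NAME": "111_test_db",
    "USER": "111_user",
    "PASSWORD": "XXX",
}
REMOTE_DB = {
    "HOST": "192.10.10.222",
    "ENGINE": "django.db.backends.mysql",
    "NAME": "222_test_db",
    "USER": "222_user",
    "PASSWORD": "YYY",
}


def build_databases(environ=os.environ):
    """Return a DATABASES dict; DJANGO_DB_TARGET is a hypothetical variable name."""
    target = environ.get("DJANGO_DB_TARGET", "local")
    chosen = REMOTE_DB if target == "remote" else LOCAL_DB
    return {"default": dict(chosen)}


# In settings.py this line would replace the hard-coded DATABASES block.
DATABASES = build_databases()
```

Starting the application with DJANGO_DB_TARGET=remote would then select the remote database; anything else falls back to the local one.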


Reload Apache:
service httpd restart

How to run multiple sites on one Apache

Assuming we have two sites to configure in Apache
www.example1.com
www.example2.com

/etc/httpd/conf.d/example1_http.conf
<VirtualHost *>
        ServerAdmin webmaster@example1.com
        ServerName  www.example1.com
        ServerAlias example1.com

        # Indexes + Directory Root.
        DirectoryIndex index.html
        DocumentRoot /home/www/www.example1.com/htdocs/
.....
</VirtualHost>

/etc/httpd/conf.d/example2_http.conf
<VirtualHost *>
        ServerAdmin webmaster@example2.com
        ServerName  www.example2.com
        ServerAlias example2.com

        # Indexes + Directory Root.
        DirectoryIndex index.html
        DocumentRoot /home/www/www.example2.com/htdocs/
.....
</VirtualHost>

Configure Hosts File in Windows:

The Hosts file in Windows is located at the following location:
C:\Windows\System32\drivers\etc

Suppose your system IP is: 192.55.44.55

Add the following to the Hosts File:
192.55.44.55 example1.com
192.55.44.55 example2.com
192.55.44.55 www.example1.com
192.55.44.55 www.example2.com

Restart Apache

Apache now serves both sites.

You can access example1.com and example2.com respectively
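
Name-based virtual hosts are selected by the Host header of the request, so each site can also be checked without touching the hosts file by sending that header directly to the server's IP. The following is a minimal sketch assuming Python 3; fetch_status is a hypothetical helper built on the standard http.client module.

```python
import http.client


def fetch_status(server_ip, host_name, port=80, path="/", timeout=5):
    """Request `path` from server_ip while presenting host_name as the Host header.

    Returns the HTTP status code and the first 200 bytes of the response body.
    """
    conn = http.client.HTTPConnection(server_ip, port, timeout=timeout)
    try:
        # Supplying Host explicitly stops http.client from adding its own.
        conn.request("GET", path, headers={"Host": host_name})
        response = conn.getresponse()
        return response.status, response.read(200)
    finally:
        conn.close()
```

Usage, with the addresses from this post: fetch_status("192.55.44.55", "www.example1.com") and fetch_status("192.55.44.55", "www.example2.com") should return content from the two different DocumentRoot directories.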

Access Default Site when Server Name/Alias mismatches

When you have multiple sites (multiple conf files) in conf.d, you can create a default conf file

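If Apache cannot match a request to any site's ServerName or ServerAlias, it falls back to the first virtual host it loaded. Conf files in conf.d are read in alphabetical order, so a file named aaa_http.conf is read first.
The following makes www.example1.com the default in case of any conflict: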

/etc/httpd/conf.d/aaa_http.conf
<VirtualHost *>
        ServerAdmin webmaster@example1.com
        ServerName  www.example1.com
        ServerAlias example1.com

        # Indexes + Directory Root.
        DirectoryIndex index.html
        DocumentRoot /home/www/www.example1.com/htdocs/
.....
</VirtualHost>

How to simulate the conflict state where Apache is not able to find the correct site conf file

For two different sites, give the same ServerName
/etc/httpd/conf.d/test1_http.conf
          ServerName  www.test.com
/etc/httpd/conf.d/test2_http.conf
          ServerName  www.test.com

If you try to access www.test.com, Apache cannot tell which site's conf file to use (test1_http.conf or test2_http.conf).

So Apache picks the top conf file in naming order, the aaa_http.conf defined above.
So you are redirected to www.example1.com (as defined in aaa_http.conf).

 

Load Testing Apache with AB (Apache Bench)

ab - Apache HTTP server benchmarking tool

  • ab is a tool for benchmarking your Apache Hypertext Transfer Protocol (HTTP) server. 
  • It is designed to give you an impression of how your current Apache installation performs. 
  • This especially shows you how many requests per second your Apache installation is capable of serving.

-n => Number of requests
Number of requests to perform for the benchmarking session. The default is to perform a single request, which usually leads to non-representative benchmarking results.

-c => Concurrency
Number of requests to perform at the same time. Default is one request at a time.

-k => Keep Alive
Enable the HTTP KeepAlive feature, i.e., perform multiple requests within one HTTP session. Default is no KeepAlive.
This sends the KeepAlive header, which asks the web server not to close the connection after each request but to keep reusing it.

-q => Suppress messages
When processing more than 150 requests, ab outputs a progress count on stderr every 10% or 100 requests or so. The -q flag will suppress these messages.

-r => Don't exit on socket receive errors.

E.g.,
ab -n 1000 -c 100 http://test.xyz.com/

Sample Output (the Concurrency Level line shows this particular run used -c 450):
Benchmarking test.xyz.com (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Finished 1000 requests

Server Software:        Apache/2.2.15
Server Hostname:        test.xyz.com
Server Port:            80

Document Path:          /
Document Length:        29089 bytes

Concurrency Level:      450
Time taken for tests:   9.537570 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      29537664 bytes
HTML transferred:       29295116 bytes
Requests per second:    104.85 [#/sec] (mean)
Time per request:       4291.906 [ms] (mean)
Time per request:       9.538 [ms] (mean, across all concurrent requests)
Transfer rate:          3024.36 [Kbytes/sec] received
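
The three headline figures in this output follow from the raw counts, which helps when reading them: requests per second is completed requests divided by total time, and the first "Time per request" is the second one multiplied by the concurrency level. A small sketch, assuming Python 3 (ab_summary is a hypothetical helper), recomputes them from the sample run:

```python
def ab_summary(requests, concurrency, seconds):
    """Recompute ab's headline figures from its raw counts."""
    per_request_ms = seconds * 1000.0 / requests
    return {
        # "Requests per second ... (mean)"
        "requests_per_second": requests / seconds,
        # "Time per request ... (mean)": as seen by one of the concurrent clients
        "time_per_request_ms": concurrency * per_request_ms,
        # "Time per request ... (mean, across all concurrent requests)"
        "time_per_request_across_ms": per_request_ms,
    }


if __name__ == "__main__":
    # Numbers taken from the sample output above.
    summary = ab_summary(requests=1000, concurrency=450, seconds=9.537570)
    for name, value in summary.items():
        print("%s: %.2f" % (name, value))
```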


AWS Route 53 Configuration

Using CLI53 Tool - Export from GoDaddy & Import Zone File to Route53



cli53 is a Python script for managing Amazon Route 53.
It's freaking awesome as it can import, export and help you debug your Route 53 setup.


1) In GoDaddy's DNS Manager you can export your domain's setup:
export file <todoist.com.zone>

2) pip install cli53

3) Add $ORIGIN yourdomain.com. at the top of the GoDaddy export file from Step 1
$ORIGIN yourdomain.com.
e.g., $ORIGIN todoist.com.

4) Import data with cli53
- Create a hosted Zone : cli53 create todoist.com
- Import your Godaddy export : cli53 import todoist.com --file <todoist.com.zone>

5) Use the excellent "Interstate53" service to recheck through a web interface that everything is set up correctly,
or just use the "cli53 rrlist" command,
e.g., cli53 rrlist todoist.com

6) Debug and re-check everything
For GoDaddy:
dig -t any @PDNS01.DOMAINCONTROL.COM todoist.com

For Route53:
dig -t any @ns-158.awsdns-19.com todoist.com

7) Update your domain's nameservers
- The last step is to update your domain's nameservers to point to Route 53

8) Transfer your domain from GoDaddy

1. How to unlock your domain with GoDaddy:
- Log in to your GoDaddy account
- Next to Domains click on Manage
- Select the domain(s) to unlock and click Lock
- Select the Off radio button to have the domain(s) unlocked and click Save
2. How to obtain the EPP/Authorization code from GoDaddy:
- Log in to your GoDaddy account
- Next to Domains click on Launch
- Click on the domain you need the EPP code for
- Click Email my code in the Authorization Code field and send it

3. You also need to disable the privacy protection service for the domain (if it's enabled).
- Go to the Domains By Proxy website and log in to your account (the login details are not the same as for your GoDaddy account. Look for the email sent by support@domainsbyproxy.com for the login details)
- Select the check box next to the domain name(s) you need to disable privacy protection for
- Click Cancel Selected, OR click on the cancellation icon next to the domain name
- In the "Confirm" window click OK

4. How to accept the transfers at GoDaddy.
- Log in to your Account Manager
- Next to Domains, click Manage,
- From the Domains menu, select Transfers
- Click on Pending Transfers Out, select the domain name(s) you are transferring from GoDaddy and click on Accept/ Decline above
- Select Accept and click OK. The request will be processed within 15 minutes.
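
Step 3 above (adding the $ORIGIN line to the exported zone file) can be scripted so it is not forgotten before the import. This is a minimal sketch assuming Python 3; ensure_origin is a hypothetical helper and is not part of cli53.

```python
def ensure_origin(zone_text, domain):
    """Return zone_text with '$ORIGIN <domain>.' as its first line if none is present."""
    origin_line = "$ORIGIN %s." % domain.rstrip(".")
    for line in zone_text.splitlines():
        if line.strip().upper().startswith("$ORIGIN"):
            return zone_text  # already declared; leave the export untouched
    return origin_line + "\n" + zone_text


if __name__ == "__main__":
    import sys

    # Usage: python add_origin.py todoist.com.zone todoist.com
    zone_path, domain = sys.argv[1], sys.argv[2]
    with open(zone_path) as handle:
        fixed = ensure_origin(handle.read(), domain)
    with open(zone_path, "w") as handle:
        handle.write(fixed)
```

The script name add_origin.py in the usage comment is likewise only an example.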


Using Route53 Console UI - Export from GoDaddy & Import Zone File to Route53

  • Get a zone file from the DNS service provider that is currently servicing the domain. The process and terminology vary from one service provider to another. Refer to your provider's interface and documentation for information about exporting or saving your records in a zone file or a BIND file.

If the process isn't obvious, try asking your current DNS provider's customer support for your records list or zone file information.
  • Click Create Hosted Zone.
  • Enter the name of your domain and, optionally, a comment. Note that the comment can't be edited later.
  • Click Create.
  • On the Hosted Zones page, double-click the name of your new hosted zone.
  • Click Import Zone File.
  • In the Import Zone File pane, paste the contents of your zone file into the Zone File text box.
  • Click Import.