Apr 29, 2020

Python Collections Counter, Defaultdict

1) Counter

from collections import Counter
myList = [10,10,20,30,4,5,3,2,3,4,2,1,2,30]
print(Counter(myList))
# Counter({2: 3, 10: 2, 30: 2, 4: 2, 3: 2, 20: 1, 5: 1, 1: 1})

print(Counter(myList).items())
# dict_items([(10, 2), (20, 1), (30, 2), (4, 2), (5, 1), (3, 2), (2, 3), (1, 1)])

print(Counter(myList).keys())
# dict_keys([10, 20, 30, 4, 5, 3, 2, 1])

print(Counter(myList).values())
# dict_values([2, 1, 2, 2, 1, 2, 3, 1])
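
Beyond items(), keys() and values(), Counter has a few helpers that are handy in practice. A minimal sketch using the same list as above; the names counts, top, missing and diff are only for illustration:

```python
from collections import Counter

my_list = [10, 10, 20, 30, 4, 5, 3, 2, 3, 4, 2, 1, 2, 30]
counts = Counter(my_list)

# The single most frequent element and its count
top = counts.most_common(1)
print(top)  # [(2, 3)]

# A missing key returns 0 instead of raising KeyError
missing = counts[999]
print(missing)  # 0

# update() adds counts from another iterable
counts.update([10, 10])
print(counts[10])  # 4

# Subtraction keeps only the positive counts
diff = Counter(a=3, b=1) - Counter(a=1, b=2)
print(diff)  # Counter({'a': 2})
```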

2) defaultdict

from collections import defaultdict

a)
d = defaultdict(list)

# A missing key is created on first access with the factory's default (an empty list here, 0 for int)
d['python'].append("awesome")
d['others'].append("not relevant")
d['python'].append("language")
d['test']  # just accessing a missing key creates it with an empty list

for i in d.items():
    print(i)

# O/P:
# ('python', ['awesome', 'language'])
# ('others', ['not relevant'])
# ('test', [])
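
A common use of defaultdict(list) is grouping items by a key without checking first whether the key exists. A small sketch; the words list and the group-by-first-letter rule are made up for illustration:

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]  # sample data

groups = defaultdict(list)
for word in words:
    # The first access to a new letter creates an empty list automatically
    groups[word[0]].append(word)

print(dict(groups))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```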


b)
d = defaultdict(int)
d['without_val']  # accessing a missing key creates it with int(), i.e. 0
d['with_val'] = 100
for i in d.items():
    print(i)

# O/P:
# ('without_val', 0)
# ('with_val', 100)

c)
demo = defaultdict(int)
print(demo[300]) # 0
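
defaultdict(int) also works as a simple tally. One thing to watch for: reading a missing key with [] inserts it, while .get() does not. A short sketch with made-up sample data (the names letter_counts and probe are illustrative):

```python
from collections import defaultdict

letter_counts = defaultdict(int)
for ch in "banana":
    letter_counts[ch] += 1  # missing keys start at 0

print(dict(letter_counts))  # {'b': 1, 'a': 3, 'n': 2}

# Reading a missing key with [] inserts it; .get() does not
probe = defaultdict(int)
probe.get("x")
size_after_get = len(probe)
probe["x"]
size_after_index = len(probe)
print(size_after_get, size_after_index)  # 0 1
```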


Apr 28, 2020

Git How to Revert a commit and push to new branch

Say you committed files to the wrong branch.
How do you revert that commit and push the changes to a new branch instead?

Existing Branch
git log --oneline
Note the id of the commit to revert, e.g., e0db5f7
git revert <commit_id>
git push

Create a new branch
git checkout -b new_branch
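git checkout <commit_id> .    # copies the files from that commit into the new branch's working tree
git commit
git push -u origin new_branch    # a new branch has no upstream yet; assumes the remote is named origin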


Apr 25, 2020

How to Configure Zookeeper and Kafka?

How to Configure Kafka?

# Download Kafka

# Kafka ENV  
  • export KAFKA_HOME=$HOME/Workspace/prabhath/personal/kafka_2.12-2.5.0 
  • export PATH=$KAFKA_HOME/bin:$PATH

Zookeeper Config:
  • bin/zookeeper-server-start.sh
  • bin/zookeeper-server-stop.sh
  • config/zookeeper.properties --> Default port: 2181, dataDir: /tmp/zookeeper

Kafka Config:
  • bin/kafka-server-start.sh
  • bin/kafka-server-stop.sh
  • config/server.properties --> Default port: 9092

1) Start zookeeper
  • zookeeper-server-start.sh $KAFKA_HOME/config/zookeeper.properties

2) Start Kafka server
  • kafka-server-start.sh $KAFKA_HOME/config/server.properties

3) Create a Kafka topic
  • kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic first_kafka_topic
  • kafka-topics.sh --list --zookeeper localhost:2181
  • The list should include first_kafka_topic

4) Start Kafka Producer
  • kafka-console-producer.sh --broker-list localhost:9092 --topic first_kafka_topic
  • <start typing data>

5) Start Kafka Consumer
  • kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic first_kafka_topic --from-beginning


Apr 19, 2020

Python collections namedtuple

import collections

fields = ['OBJECTID', 'Identifier', 'Occurrence_Date', 'Day_of_Week', 'Occurrence_Month', 'Occurrence_Day', 'Occurrence_Year', 'Occurrence_Hour', 'CompStat_Month', 'CompStat_Day', 'CompStat_Year', 'Offense', 'Offense_Classification', 'Sector', 'Precinct', 'Borough', 'Jurisdiction', 'XCoordinate', 'YCoordinate', 'Location_1']

Crime = collections.namedtuple('Crime', fields)

row1_value = ['1', 'f070032d', '09/06/1940 07:30:00 PM', 'Friday', 'Sep', '6', '1940', '19', '9', '7', '2010', 'BURGLARY', 'FELONY', 'D', '66', 'BROOKLYN', 'N.Y. POLICE DEPT', '987478', '166141', '(40.6227027620001, -73.9883732929999)']

row1_obj = Crime(*row1_value)

print(row1_obj)


Output:
Crime(OBJECTID='1', Identifier='f070032d', Occurrence_Date='09/06/1940 07:30:00 PM', Day_of_Week='Friday', Occurrence_Month='Sep', Occurrence_Day='6', Occurrence_Year='1940', Occurrence_Hour='19', CompStat_Month='9', CompStat_Day='7', CompStat_Year='2010', Offense='BURGLARY', Offense_Classification='FELONY', Sector='D', Precinct='66', Borough='BROOKLYN', Jurisdiction='N.Y. POLICE DEPT', XCoordinate='987478', YCoordinate='166141', Location_1='(40.6227027620001, -73.9883732929999)')
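
Once a row is a namedtuple, its fields can be read by name or by index, and the helper methods _make(), _asdict() and _replace() are available. A shortened sketch; only three of the fields above are used to keep it brief, and the replacement value 'QUEENS' is made up:

```python
import collections

# A reduced field list, for illustration only
fields = ['OBJECTID', 'Offense', 'Borough']
Crime = collections.namedtuple('Crime', fields)

# _make() builds an instance from an iterable, same as Crime(*values)
row = Crime._make(['1', 'BURGLARY', 'BROOKLYN'])

print(row.Offense)  # BURGLARY
print(row[2])       # BROOKLYN

# _asdict() returns a field-name to value mapping
as_dict = row._asdict()
print(as_dict['Borough'])  # BROOKLYN

# namedtuples are immutable; _replace() returns a new object
updated = row._replace(Borough='QUEENS')
print(updated.Borough, row.Borough)  # QUEENS BROOKLYN
```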