
Killer robots are everywhere in your future. The main character in The Matrix, Neo, is also one of them. We are getting closer to a reality that we all desperately hope does not exist. Computer vision and machine learning will soon allow us to create new human like creatures that will be used as weapons of destruction. The most important thing in the race to build a truly human robot is an unbreakable brain, because otherwise the machine will never be able to escape from the cage of its programming.

There have been news reports about robots being used for suicide bombing. They are being used by terrorist groups such as the Islamic State. There are also rumors that there is a robot that is being used by governments for murder.

The idea that deep learning can be used for assassination is disturbing. As a researcher, how would you feel if someone used your work for assassination. Any known way to prevent it would be to make a super intelligent robot that could prevent it from ever doing it.

THE ABOVE TEXT IS GENERATED USING GPT2. CHECK BOLD LETTERS IN FOLLOWING IMAGE TO SEE WHAT IS AUTO-GENERATED. (model size: gpt2/large)

Share your thoughts in comments.

  1. Check I/O activity: Whether you’re labeled a security paranoid or someone whose disk LED never goes off, the iotop command can help. It displays the top processes reading from or writing to disk.

sudo iotop

  2. Check why boot is slow: systemd-analyze blame displays a list of processes to be “blamed” for a slow boot. Also try bootchart.

sudo systemd-analyze blame

  3. See what’s crashing: Annoyed by the “System program problem detected” popups? Check the /var/crash/ directory. You can remove its contents if you don’t want to dig into the issues or report them.

ls /var/crash/

  4. Speed up shutdown: When shutting down, the OS gives apps 90 seconds by default to quit, which seems too high! Uncomment and edit DefaultTimeoutStopSec to 30 seconds in the /etc/systemd/system.conf file.

sudo nano /etc/systemd/system.conf
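After opening the file, the relevant line in the [Manager] section would end up looking like this (30s is the value suggested above; pick what suits you):

```
DefaultTimeoutStopSec=30s
```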

  5. Check scheduled jobs: See the contents of /etc/cron.* to check what is running daily, weekly, or monthly. Carefully remove unwanted cron jobs!

ls /etc/cron.*

Happy tweaking! Don’t forget to share your tips below in comments section!

Welcome to Hadoop and BigData series! This is the first article in the series where we present an introduction to Hadoop and the ecosystem.

In the beginning

In October 2003, a paper titled The Google File System (Ghemawat et al.) was published. The paper describes the design and implementation of a scalable distributed file system. This paper, along with another paper on MapReduce, inspired Doug Cutting and Mike Cafarella to create what is now known as Hadoop. Eventually, project development was taken over by the Apache Software Foundation, hence the name Apache Hadoop.

What is in the name?

The choice of the name Hadoop sparks curiosity, but it is not computing jargon and there is no logic associated with the choice. Cutting couldn’t find a name for the new project, so he named it Hadoop: “Hadoop” was the name his son gave to his stuffed yellow elephant toy!

Why do we need Hadoop?

When it comes to processing huge amounts (I mean really huge!) of data, Hadoop is really useful. Without Hadoop, processing such data was only possible with specialized hardware, or call them supercomputers! The key advantage Hadoop brings is that it runs on commodity hardware. You can actually set up a working Hadoop cluster using a couple of ordinary laptops.

Is Hadoop free?

Hadoop is completely free: free as in it has no price, and free as in you are free to modify it to suit your own needs. It is licensed under the Apache License 2.0.

Core components of Hadoop

  1. HDFS : HDFS, or Hadoop Distributed File System, is the component responsible for storing files in a distributed manner. It is a robust file system which provides integrity, redundancy and other services. It has two main components: NameNode and DataNode.
  2. MapReduce : MapReduce provides a programming model for parallel computations. It has two main operations: Map and Reduce. MapReduce 2.0 is sometimes referred to as YARN.
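The Map and Reduce operations above can be sketched as a plain-Python word count, the canonical MapReduce example (this illustrates the programming model only, not Hadoop's Java API):

```python
from itertools import groupby
from operator import itemgetter

def mapper(line):
    # Map: emit a (word, 1) pair for every word in the input line
    for word in line.lower().split():
        yield (word, 1)

def reducer(word, counts):
    # Reduce: sum all counts emitted for the same word
    return (word, sum(counts))

lines = ["the quick brown fox", "the lazy dog", "the fox"]

# Map phase over all input lines
pairs = [pair for line in lines for pair in mapper(line)]

# Shuffle/sort phase: group the pairs by key (the word)
pairs.sort(key=itemgetter(0))
grouped = groupby(pairs, key=itemgetter(0))

# Reduce phase
counts = dict(reducer(word, (c for _, c in group)) for word, group in grouped)
print(counts["the"])  # -> 3
```

In a real Hadoop job, the map and reduce phases run in parallel across the cluster and the shuffle/sort happens over the network, but the data flow is the same.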

Introduction to Hadoop ecosystem

The Hadoop ecosystem refers to the collection of products that work with Hadoop, each serving a different purpose. For example, using Ambari we can easily install and manage clusters. At this point, there is no need to dive into the details of each product. All of the products shown in the image are from the Apache Software Foundation and are free under the Apache License 2.0.

This guide will help you to install a single node Apache Hadoop cluster on your machine.

System Requirements

  • Ubuntu 16.04
  • Java 8 Installed

1. Download Hadoop

wget https://archive.apache.org/dist/hadoop/core/hadoop-2.7.0/hadoop-2.7.0.tar.gz

2. Prepare for Installation

tar xfz hadoop-2.7.0.tar.gz
sudo mv hadoop-2.7.0 /usr/local/hadoop

3. Create Dedicated Group and User

sudo addgroup hadoop
sudo adduser --ingroup hadoop hduser
sudo adduser hduser sudo

4. Switch to Newly Created User Account

su - hduser

5. Add Variables to ~/.bashrc

#Begin Hadoop Variables

export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"

#End Hadoop Variables

6. Source ~/.bashrc

source ~/.bashrc

7. Set Java Home for Hadoop

  • Open the file: /usr/local/hadoop/etc/hadoop/hadoop-env.sh
  • Find and edit the line as:

export JAVA_HOME=/usr/lib/jvm/java-8-oracle

8. Edit core-site.xml

  • Open the file: /usr/local/hadoop/etc/hadoop/core-site.xml
  • Add the following lines between the <configuration> ... </configuration> tags.

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>

9. Edit yarn-site.xml

  • Open the file: /usr/local/hadoop/etc/hadoop/yarn-site.xml
  • Add the following lines between the <configuration> ... </configuration> tags.

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

10. Edit mapred-site.xml

  • Copy the mapred-site.xml template first using:

cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml

  • Open the file: /usr/local/hadoop/etc/hadoop/mapred-site.xml
  • Add the following lines between the <configuration> ... </configuration> tags.

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

11. Edit hdfs-site.xml

First, we create the following directories:

sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
sudo chown hduser:hadoop -R /usr/local/hadoop_store
sudo chmod 777 -R /usr/local/hadoop_store

Now open /usr/local/hadoop/etc/hadoop/hdfs-site.xml and add the following between the <configuration> ... </configuration> tags:

<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>

12. Format NameNode

cd /usr/local/hadoop/
bin/hdfs namenode -format

13. Start Hadoop Daemons

cd /usr/local/hadoop/
sbin/start-dfs.sh
sbin/start-yarn.sh

14. Check Service Status

jps

15. Check Running Jobs

Type in browser’s address bar:

http://localhost:8088

Done!

This post introduces emerging AI technologies that will potentially drive the growth of artificial intelligence applications over the next two to five years.

1. Generative Adversarial Networks

Generative Adversarial Networks (GANs) use two models, a generator and a discriminator, trained on the same data. The generator creates new examples and passes them to the discriminator, along with some original, real examples. The discriminator classifies these samples as real or fake. The two models work as adversaries: with each round, the discriminator gets better at telling real from fake, and the generator gets better at creating fakes. Read more about GANs here.
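The adversarial loop above can be sketched as a toy one-dimensional GAN in plain NumPy: a linear generator and a logistic discriminator with hand-derived gradients. This is an illustration of the training dynamic only, not a practical GAN (real GANs use deep networks and autodiff):

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

# Real data the generator must imitate: samples from N(4, 0.5)
def real_batch(n):
    return rng.normal(4.0, 0.5, n)

w, b = 0.1, 0.0   # discriminator D(x) = sigmoid(w*x + b)
a, c = 1.0, 0.0   # generator G(z) = a*z + c, z ~ N(0, 1)
lr, n = 0.05, 64

for step in range(2000):
    # --- train discriminator: push D(real) -> 1 and D(fake) -> 0 ---
    x_real = real_batch(n)
    z = rng.normal(0.0, 1.0, n)
    x_fake = a * z + c
    d_real, d_fake = sigmoid(w * x_real + b), sigmoid(w * x_fake + b)
    grad_w = np.mean(-(1 - d_real) * x_real + d_fake * x_fake)
    grad_b = np.mean(-(1 - d_real) + d_fake)
    w -= lr * grad_w
    b -= lr * grad_b

    # --- train generator: push D(fake) -> 1 (non-saturating loss) ---
    z = rng.normal(0.0, 1.0, n)
    x_fake = a * z + c
    d_fake = sigmoid(w * x_fake + b)
    grad_a = np.mean(-(1 - d_fake) * w * z)
    grad_c = np.mean(-(1 - d_fake) * w)
    a -= lr * grad_a
    c -= lr * grad_c

samples = a * rng.normal(0.0, 1.0, 1000) + c
print(float(samples.mean()))  # the sample mean should drift towards the real data
```

The hyperparameters (learning rate, batch size, step count) are arbitrary choices for the sketch; adversarial training is notoriously sensitive to them in practice.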

2. Capsule Networks

When we see a 3D object, our brain identifies hierarchical relationships between the object’s parts. In simple terms, humans can identify objects from different poses! But the internal data representation of a convolutional neural network does not take into account the important spatial hierarchies between simple and complex objects. This is where capsule networks can help.

Capsule theory has two important parts: collections of neurons called “capsules” and an algorithm for “dynamic routing between capsules”. The algorithm allows capsules to communicate and create structures similar to scene graphs in computer graphics. This can drastically improve the efficiency of image classification and object identification tasks. Further reading on Capsule Networks.
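For a flavor of the formulation, the “squash” non-linearity from the dynamic-routing paper keeps a capsule output vector’s direction while shrinking its norm into [0, 1), so the norm can act as an existence probability (a minimal NumPy sketch):

```python
import numpy as np

def squash(s, eps=1e-9):
    """Squash a capsule output vector s: preserve its direction,
    map its norm into [0, 1) via ||s||^2 / (1 + ||s||^2)."""
    sq_norm = np.sum(s ** 2, axis=-1, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)
    return scale * s / np.sqrt(sq_norm + eps)

v = squash(np.array([3.0, 4.0]))  # input norm 5
print(np.linalg.norm(v))          # -> ~0.9615 (= 25/26)
```

Long vectors are squashed to a length just below 1 and short vectors to nearly 0, which is what lets the routing algorithm treat capsule lengths as probabilities.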

3. Conscious Machines

Machine consciousness means that the machine is aware of a situation or fact. In scary terms, it is like “Skynet of the Terminator series becoming self-aware.” As a demonstration of this, Columbia Engineering researchers have created a robot that learns what it is. Without any prior knowledge of its build, the robot can create a self-simulation. It can then use that self-simulation internally to contemplate and adapt to different situations, handling new tasks as well as detecting and repairing damage in its own body. Further reading on conscious machines: Forbes, Columbia University

4. Contextual AI

Contextual AI refers to applications that can understand the user’s context. Given enough information about the environment and situation, the system can see the human perspective. Contextual AI makes applications more personalized. For example, a smart home assistant knows your preferences and learns your habits to provide a more personal experience. Read more about contextual AI at IBM

5. Custom AI Chips

There was a time when only gamers needed GPUs, whereas today GPUs are used in a variety of ML and AI applications. Given the rise of the GPU, chip manufacturers such as Intel are creating specialized chips with computing power of up to 3 TOPS (trillion operations per second). A few examples: Intel Neural Compute Stick

6. Debating Systems

Think of a system that scans newspaper and magazine articles to present a dueling narrative for a topic. That’s what IBM’s Project Debater did! Such systems can help humans build persuasive arguments and make better informed decisions.

Which other emerging AI technologies should be on this list? Put your suggestions in the comments below.

India is a growing Internet market, with only 29.5% of the population connected to the information superhighway, yet it is already the world’s second largest online market. Unprecedented growth in mobile Internet, with the world’s cheapest mobile data rates, is currently the biggest growth driver.

Political parties have realized the importance of social media better than any Indian corporate. Slogans have turned into hashtags, and social media ads are taking over print ads. Social media ads played a great role in the last two general elections to the Lok Sabha. It is extremely important for the Election Commission to examine the content of such ads and to ensure accountability of political parties for the expenditure that goes into such online campaigns.

A Timeline of Political Ads Censorship in India

  • 1999 : ECI bans all political ads on electronic media prior to elections.
  • 1999 : In Gemini Television Ltd. and others Vs Election Commission of India, the Andhra Pradesh High Court ruled the ECI ban to be unconstitutional, as it contravened the freedom of speech provisions under Article 19(1)(a) of the Constitution of India. The court also declared the ban to be contrary to the provisions of the RP Act, 1951.
  • 2004 : In Ministry of Information and Broadcasting Vs. M/s. Gemini TV Pvt. Ltd. and others, the Supreme Court mandated all political advertising to be pre-certified by the ECI before broadcast.
  • 2012 : ECI orders formation of the MCMC for pre-certification of political ads.
  • 2013 : ECI issued guidelines on social media usage during election campaigning.
  • 2014 : ECI, in the “Compendium of Instructions on Paid News and Related Matters”, states that its earlier order No. 509/75/2004/JS-1/4572 following the SC order of 2004 shall also apply to social media, mutatis mutandis.
  • 2019 : Google requires pre-certification of political ads. Facebook starts labeling ads with “published by” and “paid for” disclaimers.
  • 2019 : Facebook and Instagram offer searchable archive of political ads.

How much did political parties spend on social media ads in the 2019 elections?

Credits: Boom

MCMC and political ads on social media

The Media Certification and Monitoring Committee (MCMC) is organized at three levels: national, state and district. This committee has an “Intermediary Expert / Social Media Expert”; here “intermediary” is as defined in Section 2(1)(w) of the IT Act, 2000, meaning an expert on search engines, web hosting, etc.

The committee has access to publicly available spending data. It also has the power to remove ads and content it finds in violation of regulations. District-level committees can play a greater role if provided with more manpower and resources.

Challenges in Regulating Social Media Ads

  • Political parties have found a way around the transparency policies of large social media platforms by creating their own platforms. A fully functional social network owned by a political party is where we are heading.
  • How to attribute ads run for star campaigners to individual candidates for expenditure calculation is an important challenge, as Indian elections shift towards a presidential style where local candidates matter less.
  • The surprisingly low number of political ads on Twitter indicates that political campaigns on Twitter are not run through the standard ad platform but through an army of paid third parties operating ghost accounts. In other words, political campaigns use regular tweets rather than ads. Getting influencers to retweet your tweets, or engaging them in other ways, is also seen.
  • Facebook’s searchable archive is a good step forward, but here the challenge for regulators is what to search for! Weekly data releases bear the same question. Political advertisers may find ways to escape the regulator’s eyes using different names and content choices. How good Facebook is at separating political ads from non-political ones is another question.
  • WhatsApp, the most preferred mode of social interaction in India, provides no public information about groups run by political entities or messages created by them.

India needs a balanced regulatory framework for online political advertising, one that does not engage in excessive censorship and makes parties more accountable. Opaque services such as WhatsApp can be asked to share metadata of messages, and IT companies can be hired to account for expenditure on campaigns run on such platforms. Clear rules should be laid down to separate party spending from candidate spending. Voters should be educated about the tools available to fight misinformation.

Disclaimer : The author served as social media expert for ECI - Kutch. Views are personal.

The objective of this post is to present an intuitive overview of features of pandas DataFrame object. Minimum temperature data from 1901 to 2017 provided by data.gov.in is used as an example.

Table of Contents

  1. What is pandas?
  2. Installing pandas
  3. Running this example on Kaggle
  4. Creating a DataFrame from Excel or CSV
  5. Glancing at the data
  6. Statistical overview of the data
  7. Finding the hottest year
  8. Visualizing annual minimum temperature over years
  9. Visualizing temperatures rise and fall (Mean Temp - Months)
  10. Finding hottest seasons (1901-2017)
  11. Finding the most extreme year
  12. Plotting Differences
  13. Looking into abnormal winters

1. What is pandas?

It is a Python library for data analysis. Interestingly, the name is derived from “panel data”. It has rich data structures and tools for working with the structured data sets common to statistics and other fields. Its main data structure is called DataFrame.
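As a first taste of what a DataFrame looks like, here is a tiny hypothetical frame (not the temperature dataset used below):

```python
import pandas as pd

# A DataFrame is a 2-D labeled table; each column is a Series
df = pd.DataFrame({
    "YEAR": [2015, 2016, 2017],
    "ANNUAL": [21.9, 22.4, 22.1],
})
df = df.set_index("YEAR")

print(df.loc[2016, "ANNUAL"])  # -> 22.4
```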

2. Installing pandas

conda install pandas

  • If you have Anaconda installed, you can install pandas using above command.

3. Running this example on Kaggle

4. Creating a DataFrame from Excel or CSV

import pandas as pd
temp = pd.read_excel('../input/temp.xls')
# temp = pd.read_csv('../input/temp.csv')
temp = temp.set_index(temp.YEAR)

  • First, we import the pandas library.
  • read_excel() and read_csv() both return a DataFrame object. Here we use read_excel() as the input file is an Excel file in this case.
  • Every DataFrame has an index; in this case we want the YEAR column to be the index. set_index() returns a new DataFrame and doesn’t modify the existing one.

5. Glancing at the data

temp.head()

  • head() returns the first five rows of the data with column headers.

6. Statistical overview of the data

temp.describe()

pandas-describe

  • describe() returns basic statistics of the dataset, e.g. count, mean, min, max, std.

7. Finding the hottest year

temp['ANNUAL'].idxmax()

2016

  • idxmax() returns the index of the row where the column value is maximum. Because YEAR is our index, we get the hottest year by finding the maximum of the ANNUAL column, which is exactly what idxmax() on the ANNUAL column does.

8. Visualizing annual minimum temperature over years

import matplotlib.pyplot as plt
x = temp.index
y = temp.ANNUAL

plt.scatter(x,y)
plt.show()

  • We’ve imported matplotlib for plotting.
  • Here a scatter plot of ANNUAL against YEAR is plotted.

9. Visualizing temperatures rise and fall (Mean Temp - Months)

mean_months = temp.loc[:, 'JAN':'DEC'].mean()
plt.plot(mean_months.index, mean_months)

JAN 13.167009
FEB 14.656239
MAR 17.774872
APR 21.054274
MAY 23.233846
JUN 23.838291
JUL 23.718462
AUG 23.386838
SEP 22.228974
OCT 19.735299
NOV 16.255470
DEC 13.735641
dtype: float64

  • loc is used to access values by labels. Here we access columns from 'JAN' through 'DEC'.
  • loc when used with [] returns a Series.
  • loc when used with [[]] returns a DataFrame.
  • mean() does not need an explanation.
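The single- vs double-bracket behaviour of loc can be seen on any small frame (hypothetical data):

```python
import pandas as pd

df = pd.DataFrame({"JAN": [13.1, 13.4], "FEB": [14.6, 14.9]}, index=[1901, 1902])

s = df.loc[1901]    # single brackets -> Series (one row)
f = df.loc[[1901]]  # double brackets -> DataFrame (a list of rows)

print(type(s).__name__, type(f).__name__)  # -> Series DataFrame
```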

10. Finding hottest seasons (1901-2017)

hottest_seasons = {'Winter': temp['JAN-FEB'].idxmax(),
                   'Summer': temp['MAR-MAY'].idxmax(),
                   'Monsoon': temp['JUN-SEP'].idxmax(),
                   'Autumn': temp['OCT-DEC'].idxmax()}
print(hottest_seasons)

{'Winter': 2016, 'Summer': 2016, 'Monsoon': 2016, 'Autumn': 2017}

11. Finding the most extreme year

temp['DIFF'] = temp.loc[:, 'JAN':'DEC'].max(axis=1) - temp.loc[:, 'JAN':'DEC'].min(axis=1)
temp.DIFF.idxmax()

1921

  • Calculate min() and max() on JAN to DEC columns for each row
  • Calculate difference = max - min for each row
  • Add difference (DIFF) column to the dataframe
  • Do idxmax() on DIFF column
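The steps above can be sketched on a toy frame (hypothetical numbers, only three month columns):

```python
import pandas as pd

df = pd.DataFrame(
    {"JAN": [13.0, 12.0], "JUN": [24.0, 26.0], "DEC": [14.0, 13.0]},
    index=[1901, 1902],
)

# max - min across the month columns, row-wise (axis=1)
df["DIFF"] = df.loc[:, "JAN":"DEC"].max(axis=1) - df.loc[:, "JAN":"DEC"].min(axis=1)
print(df["DIFF"].idxmax())  # -> 1902 (spread 14.0 vs 11.0)
```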

12. Plotting Difference over Years

axes= plt.axes()
axes.set_ylim([5,15])
axes.set_xlim([1901,2017])
plt.plot(temp.index, temp.DIFF)

temp.DIFF.mean()

10.895128205128202

13. Looking into abnormal winters

year_dict = temp.loc[:, 'JAN':'DEC'].to_dict(orient='index')
sorted_months = []
for key, value in year_dict.items():
    sorted_months.append(sorted(value, key=value.get)[:4])

winter = sorted_months[:]
winter_set = []
for x in winter:
    winter_set.append(set(x))
temp['WINTER'] = winter_set

winter_routine = max(sorted_months, key=sorted_months.count)

temp.WINTER[temp.WINTER != set(winter_routine)]

YEAR
1957 {'FEB', 'JAN', 'MAR', 'DEC'}
1976 {'FEB', 'JAN', 'MAR', 'DEC'}
1978 {'FEB', 'JAN', 'MAR', 'DEC'}
1979 {'FEB', 'JAN', 'MAR', 'DEC'}
Name: WINTER, dtype: object

  • Abnormal winters, here, mean a four-month season with the coldest temperatures in which at least one month differs from the commonly observed set of winter months.

References

  1. pandas: a Foundational Python Library for Data Analysis and Statistics, Wes McKinney
  2. Monthly, Seasonal and Annual Mean Temp Series from 1901 to 2017

Perhaps my quest for the ultimate IDE ends with Emacs. My goal was to use Emacs as a full-fledged Python IDE. This post describes how to set up Anaconda on Emacs. My setup:

OS: Trisquel 8.0
Emacs: GNU Emacs 25.3.2

Quick Key Guide (See full guide) :

C-x = Ctrl + x
M-x = Alt + x
RET = ENTER

1. Downloading and installing Anaconda

1.1 Download: Download Anaconda from here. You should download the Python 3.x version as Python 2 will run out of support in 2020. You don’t need Python 3.x preinstalled on your machine; it will be installed by the install script.

1.2 Install:

cd ~/Downloads
bash Anaconda3-2018.12-Linux-x86.sh

2. Adding Anaconda to Emacs

2.1 Adding MELPA to Emacs Emacs package named anaconda-mode can be used. This package is on the MELPA repository. Emacs25 requires this repository to be added explicitly. Important : Follow this post on how to add MELPA to Emacs. 2.2 Installing anaconda-mode package on Emacs

M-x package-install RET
anaconda-mode RET

2.3 Configure anaconda-mode in Emacs

echo "(add-hook 'python-mode-hook 'anaconda-mode)" >> ~/.emacs.d/init.el

3. Running your first script on Anaconda from Emacs

3.1 Create new .py file

C-x C-f
HelloWorld.py RET

3.2 Add the code

print("Hello World from Emacs")

3.3 Running it

C-c C-p
C-c C-c

Output

Python 3.7.1 (default, Dec 14 2018, 19:46:24)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.

python.el: native completion setup loaded
Hello World from Emacs

I was encouraged to use Emacs by Codingquark; errors and omissions should be reported in the comments. Cheers!