How can I become a data scientist from an absolute beginner level to an advanced level?

Awantik Das
Posted on May 22, 2017, 12:01 p.m.


Here is a journey that works. As a trainer myself, I have seen many success stories.

Most importantly, believe that you can do it. At least 30% of the work gets done with this.

Python is the best programming language to start with because it is easy to learn and absorb. Spend 25–30 hours mastering it. Be thorough with lists, dictionaries, functional programming, classes, regular expressions, iterators, and generators.
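As a rough checklist for the topics above, here is a minimal sketch that touches each one once. The sample data and the names (scores, Student, squares) are made up for illustration.

```python
import re
from functools import reduce

# Lists and dictionaries (sample data invented for this example)
scores = [72, 88, 95, 61]
by_name = {"asha": 72, "ravi": 88, "meena": 95, "john": 61}

# Functional style: a comprehension, reduce, and a key function
passed = [s for s in scores if s >= 70]
total = reduce(lambda a, b: a + b, scores)
top = max(by_name, key=by_name.get)


# A small class with one method
class Student:
    def __init__(self, name, score):
        self.name = name
        self.score = score

    def grade(self):
        return "pass" if self.score >= 70 else "fail"


# Regular expression: pull the integers out of free text
numbers = [int(n) for n in re.findall(r"\d+", "rows=120, cols=8")]


# Generator: produces values lazily, one per iteration
def squares(limit):
    for i in range(limit):
        yield i * i


first_squares = list(squares(5))
```

If you can read every line here and predict what each variable holds, you are ready for the next step.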

Next, learn NumPy; about 5 hours is enough. Plenty of documentation and tutorials are available for it.
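A few NumPy ideas are worth practicing in those hours: array creation, axis-wise aggregation, broadcasting, boolean masks, and matrix products. The array below is an arbitrary example, not taken from any particular tutorial.

```python
import numpy as np

# A 2x3 array holding the values 0..5
a = np.arange(6).reshape(2, 3)

# Aggregate along an axis: the mean of each column
col_means = a.mean(axis=0)

# Broadcasting: subtract the column means from every row
centered = a - col_means

# Boolean mask: keep only the even values
evens = a[a % 2 == 0]

# Matrix product of a (2x3) with its transpose (3x2)
product = a @ a.T
```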

pandas is used extensively for data wrangling and processing. This needs about 10 hours. Practice by processing some CSV and Excel data.
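A typical first exercise is to load a CSV, fill in missing values, derive a column, and aggregate. The sketch below reads an in-memory CSV so it runs without any file on disk; the sales data and column names are invented for the example.

```python
import io

import pandas as pd

# Invented sample data; one "units" value is deliberately missing
csv_text = """city,product,units,price
Pune,pen,10,5.0
Pune,book,3,120.0
Delhi,pen,,5.0
Delhi,book,7,120.0
"""

df = pd.read_csv(io.StringIO(csv_text))

# Treat the missing unit count as zero and store the column as integers
df["units"] = df["units"].fillna(0).astype(int)

# Derive a new column, then aggregate it per city
df["revenue"] = df["units"] * df["price"]
revenue_by_city = df.groupby("city")["revenue"].sum()
```

With a real file you would pass its path to pd.read_csv instead. pd.read_excel works the same way for spreadsheets, though it needs an extra Excel reader package such as openpyxl installed.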

Data visualization is another important aspect. Matplotlib is a good library to start with. Another ~5 hours will boost your confidence.
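Line plots and histograms cover a lot of everyday needs, so they make a reasonable starting point. This sketch draws one of each from generated data. It assumes a non-interactive backend and renders into memory so that it runs anywhere; in a notebook you would normally just call plt.show().

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)
rng = np.random.default_rng(0)

# Two panels side by side
fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

# Left panel: a labeled line plot
ax_line.plot(x, np.sin(x), label="sin(x)")
ax_line.set_xlabel("x")
ax_line.set_ylabel("value")
ax_line.legend()

# Right panel: a histogram of random samples
ax_hist.hist(rng.normal(size=500), bins=20)
ax_hist.set_title("500 normal samples")

fig.tight_layout()

# Render to an in-memory PNG; pass a filename instead to save to disk
buf = io.BytesIO()
fig.savefig(buf, format="png")
```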

scikit-learn, a Python machine learning library, ships with many ready-made datasets. Start using those.
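For instance, the iris dataset can be loaded in one call. This sketch assumes a reasonably recent scikit-learn (0.23 or later, where the as_frame option exists).

```python
from sklearn.datasets import load_iris

# as_frame=True returns pandas objects instead of NumPy arrays
iris = load_iris(as_frame=True)
X, y = iris.data, iris.target

n_rows, n_features = X.shape
class_names = list(iris.target_names)
```

Other loaders in sklearn.datasets, such as load_diabetes and load_wine, follow the same pattern.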

Now it is time for machine learning. Remember, and keep telling yourself, that you don't have to reinvent the algorithms here. The first level is to use them to solve problems.

First, try applying linear regression to very simple datasets. Then move on to scikit-learn's pipelines, hyperparameter tuning, and cross-validation. This will give you a fair idea of how to solve problems.

Then move on to other kinds of problems, such as classification and clustering.

At this stage, you will be confident enough to tackle new problems and to dig deeper into the machine learning algorithms themselves.
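Here is one way these pieces can fit together, as a sketch rather than a recipe. It uses the built-in diabetes dataset. Because plain LinearRegression has no hyperparameter worth tuning, the pipeline swaps in Ridge (linear regression with a regularization strength, alpha) for the tuning step; that substitution and the alpha grid are choices made for this example.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Step 1: plain linear regression as a baseline (score is R^2)
baseline = LinearRegression().fit(X_train, y_train)
baseline_r2 = baseline.score(X_test, y_test)

# Step 2: a pipeline chains preprocessing and the model into one estimator
pipe = Pipeline([("scale", StandardScaler()), ("model", Ridge())])

# Step 3: 5-fold cross-validation on the training data
cv_scores = cross_val_score(pipe, X_train, y_train, cv=5)

# Step 4: grid search over the regularization strength (example grid)
search = GridSearchCV(pipe, {"model__alpha": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)
best_alpha = search.best_params_["model__alpha"]
test_r2 = search.score(X_test, y_test)
```

The "model__alpha" key is how scikit-learn addresses a parameter of a named pipeline step: step name, two underscores, parameter name.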
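A minimal sketch of both on the iris dataset follows. Logistic regression and k-means are picked here only as common first examples of classification and clustering; the post does not name specific algorithms.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Classification: learn from labeled examples, then score on held-out data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

# Clustering: group the same rows without looking at the labels
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
cluster_labels = kmeans.labels_
```

Notice that the estimator interface (fit, then score or labels_) stays the same across algorithms, which is why trying new ones is cheap once the basics are in place.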

Happy Learning.


Awantik Das is a technology evangelist currently working as a corporate trainer. He has trained more than 3,000 professionals from Fortune 500 companies, including Cognizant, Mindtree, HappiestMinds, CISCO, and others. He also consults on talent acquisition for leading companies in niche technologies. Previously, he worked with technology companies such as CISCO, Juniper, and Rancore (a Reliance Group company).




Keywords : data-science spark

