How to set up a Data Science environment on Windows using Anaconda

by mark · Published 1 April 2020 · Updated 1 April 2020

So you’ve just started learning about Data Science and don’t know how to *practically* get started using Python. Well, this is the right place. While you might think this is difficult, it is actually a very simple task with Anaconda. Let’s get started!

Requirements

There are no strict requirements to set up your environment. Later down the road, you might find that certain tasks, such as training Artificial Neural Networks (ANN), may require specific hardware such as a supported Graphics Processing Unit (GPU). With the exception of ANNs, most Machine Learning techniques will not require such power, and in any case, once you have developed something interesting, you will want to move it onto a serious production cluster.

Generally speaking, the better your machine, the faster algorithms will run, and you will be able to tackle entire datasets rather than subsets of them.

Anaconda: the great Python experience

The silent Data Science revolution is brought by Anaconda: a Python distribution that bundles the conda package manager (and more)! Anaconda is an integrated environment that contains everything you need to get started. You can get it for free here. Unless there is a very specific reason, you will want to choose the Python 3 variant, because Python 2 reached its end of life on 1 January 2020 and no longer receives updates.

Walk through the setup and be sure to check both “Add Anaconda to my PATH environment variable” and “Register Anaconda as my default Python 3.*”.
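Once the installer has finished, a quick way to confirm which Python answers on your machine is to ask the interpreter itself. The snippet below is a minimal sketch using only the standard library; the function name `describe_interpreter` is made up for this example. If the two options above took effect, the printed path should point inside your Anaconda installation folder.

```python
import sys


def describe_interpreter():
    """Return the running interpreter's path and version as a dict."""
    return {
        # Full path of the python executable that is running this code.
        "executable": sys.executable,
        # Version as "major.minor.micro", e.g. "3.8.2".
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "is_python3": sys.version_info.major == 3,
    }


if __name__ == "__main__":
    info = describe_interpreter()
    print("Interpreter:", info["executable"])
    print("Version:", info["version"])
    if not info["is_python3"]:
        print("Warning: this is not a Python 3 interpreter.")
```

Save it to a file (any name will do) and run it with `python` from a command prompt.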

That’s it! You now have your environment installed. Now let’s move onto making your environment a Data Science environment.

Anaconda: the first steps towards Data Science

The first tool you may need is the Anaconda Navigator: a handy tool to manage your installation graphically. Here you can head to Environments and create a new environment. Environments are like separate installations of Python: you can have different environments with different versions of Python and different packages.

It’s always best to keep the base environment as it is and create a new environment for your needs. Once you have created an environment, you can go on and install the following packages:

numpy
scikit-learn
matplotlib
pandas
seaborn

In case you’re wondering why Jupyter Notebook is not on the list: Anaconda installs it by default (you can find it in the Anaconda Navigator). You may also add other packages that you need, such as TensorFlow.
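After installing the packages listed above, you may want to check that Python can actually find them. Here is a small sketch that does so without importing anything heavy; the names `PACKAGES` and `find_missing` are invented for this example. The one thing worth knowing is that scikit-learn is imported under the name `sklearn`, while the other four use their package name.

```python
import importlib.util

# Maps the package name you install to the name used in an import statement.
# Only scikit-learn differs: it is imported as "sklearn".
PACKAGES = {
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
    "pandas": "pandas",
    "seaborn": "seaborn",
}


def find_missing(packages=PACKAGES):
    """Return the package names whose import module cannot be found."""
    return [
        name
        for name, module in packages.items()
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All suggested packages are available.")
```

Run it with the Python of the environment you want to check, since each environment has its own set of packages.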

If you are more of a command line person (as most people reading this site are), you can do everything through the command line! Search your Start menu for “Anaconda Prompt” to open a shell. From there, run the following (the “>” is the prompt; don’t include it in your commands!):

> conda create --name myenv
> conda activate myenv
> conda install numpy scikit-learn matplotlib pandas seaborn

The first command creates a new environment named “myenv”, the second activates it (all subsequent commands will be executed within that environment), and the third installs the suggested packages.
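If you are ever unsure which environment you are working in, Python can tell you. The sketch below relies on the `CONDA_DEFAULT_ENV` environment variable, which conda sets when an environment is activated, and on `sys.prefix`, the folder of the running interpreter. The function name `active_environment` is made up for this example.

```python
import os
import sys


def active_environment():
    """Return (name, prefix): the conda environment active in the shell
    and the installation folder of the running interpreter."""
    # conda sets CONDA_DEFAULT_ENV on activation; it is absent otherwise.
    name = os.environ.get("CONDA_DEFAULT_ENV")
    return name, sys.prefix


if __name__ == "__main__":
    name, prefix = active_environment()
    print("Active conda environment:", name or "(none detected)")
    print("Interpreter prefix:", prefix)
```

After running `conda activate myenv`, the name should read “myenv” and the prefix should end with the environment's folder.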

That’s it!

Whether you used the graphical method or the command line, you now have a Python Data Science environment. Your principal point of access is the Jupyter Notebook (you can also try JupyterLab if you want to). You can start Jupyter through the Anaconda Navigator or by running “jupyter notebook” in the Anaconda Prompt. Anaconda will also create a useful Jupyter shortcut in your Start menu.
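If the “jupyter notebook” command is not recognized, the launcher is probably not on your PATH. A minimal sketch for checking this from Python, using only the standard library (the function name `find_launcher` is invented for this example):

```python
import shutil


def find_launcher(command="jupyter"):
    """Return the full path of a command found on PATH, or None."""
    # On Windows, shutil.which also tries extensions such as .exe.
    return shutil.which(command)


if __name__ == "__main__":
    path = find_launcher()
    if path is None:
        print("jupyter was not found on PATH.")
    else:
        print("jupyter launcher:", path)
```

When nothing is found, starting Jupyter from the Anaconda Navigator or from the Anaconda Prompt is usually the easier route.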

Image courtesy of mark | marksei

