Powerful insights that help you make smarter decisions. http://datamart.org Sat, 13 May 2017 08:48:34 +0000

Michael Hochster, PhD in Statistics, Stanford; Director of Research, Pandora – As a data scientist, how do you answer when non-technical people ask you “is your analysis result statistically significant?” http://datamart.org/2017/05/13/michael-hochster-phd-statistics-stanford-director-research-pandora-data-scientist-answer-non-technical-people-ask-analysis-result-statistically-significa/ Sat, 13 May 2017 08:48:34 +0000 http://datamart.org/?p=7141

Answer: They are asking whether you have enough data to trust your results. I would try to answer the real question and not worry too much about whether the technical jargon is being used correctly (that’s my job, not theirs).

The post Michael Hochster, PhD in Statistics, Stanford; Director of Research, Pandora – As a data scientist, how do you answer when non-technical people ask you “is your analysis result statistically significant?” appeared first on Powerful insights that help you make smarter decisions. .

Tetiana Ivanova – How to become a Data Scientist in 6 months a hacker’s approach to career planning http://datamart.org/2017/03/02/tetiana-ivanova-become-data-scientist-6-months-hackers-approach-career-planning/ Thu, 02 Mar 2017 23:17:56 +0000 http://datamart.org/?p=7137

You don’t need a PhD or even a master’s degree to do machine learning. This talk covers taking calculated risks, and especially calculated exits from one’s comfort zone, with some notes on soul searching and on choosing a career that is also a passion. It also includes a reading list.

How to Become a Data Scientist in 2017? | Data Scientist Career | Data Science Future http://datamart.org/2017/03/02/become-data-scientist-2017-data-scientist-career-data-science-future/ Thu, 02 Mar 2017 23:08:48 +0000 http://datamart.org/?p=7133

Jesse Steinweg-Woods will soon be a Senior Data Scientist at tronc, working on recommender systems for articles and on understanding customer behavior. Previously, he worked at Argo Group Insurance on new pricing models that took advantage of machine learning techniques. He received his PhD in Atmospheric Science from Texas A&M University, where his research focused on numerical weather and climate prediction.

Select Features and Target in Scikit Learn http://datamart.org/2017/02/22/select-features-target-scikit-learn/ Wed, 22 Feb 2017 15:30:26 +0000 http://datamart.org/?p=7122

To do machine learning in scikit-learn, the first step is to import the following:
import pandas as pd
import numpy as np
Step 1. We read the file into a pandas DataFrame with pd.read_csv. In a Jupyter notebook we defined the DataFrame as
df = pd.read_csv(r'C:\Data\glass.csv')
Pic 1: Data import in scikit-learn

To select the features and the target for machine learning, we use the following commands:

X = df[list(df.columns)[:-1]]
y = df['Type']
X holds the input features.
y holds the target.
The command X = df[list(df.columns)[:-1]] keeps every column except the last one, which drops the Type column from the input features; y = df['Type'] then uses that column as the target.
If we check X by running X.info(), we see the following columns:

Int64Index: 214 entries, 0 to 213
Data columns (total 9 columns):
RI 214 non-null float64
Na 214 non-null float64
Mg 214 non-null float64
Al 214 non-null float64
Si 214 non-null float64
K 214 non-null float64
Ca 214 non-null float64
Ba 214 non-null float64
Fe 214 non-null float64
dtypes: float64(9)
memory usage: 16.7 KB

You will notice that the 'Type' column, which is the target variable, is not shown above, because [:-1] removed the last column, which was the target column. This is a very useful command: if we enter [:-2], the last two columns are removed, and so on.
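The column slicing described above can be tried without the original file. The following is a minimal sketch that assumes a small made-up DataFrame in place of glass.csv (the values are hypothetical; only the column names RI, Na, Mg and Type come from the post). It also shows a name-based alternative using drop, which is not part of the original post but does not depend on the target being the last column.

```python
import pandas as pd

# Hypothetical stand-in for glass.csv: a few feature columns, target last.
df = pd.DataFrame({
    "RI": [1.52, 1.51, 1.53, 1.52],
    "Na": [13.6, 13.9, 13.5, 13.2],
    "Mg": [4.49, 3.60, 3.55, 3.69],
    "Type": [1, 1, 2, 2],
})

# Positional selection: every column except the last one.
X = df[list(df.columns)[:-1]]
y = df["Type"]

# Name-based alternative; does not depend on column order.
X_by_name = df.drop(columns=["Type"])

# Dropping the last two columns instead.
X_minus_two = df[list(df.columns)[:-2]]

print(list(X.columns))
print(list(X_minus_two.columns))
```

The positional form is short, but it silently picks the wrong columns if the file layout changes, so selecting by name may be safer when the target column name is known.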

Pre-Modeling: Data Preprocessing and Feature Exploration in Python http://datamart.org/2017/02/13/pre-modeling-data-preprocessing-feature-exploration-python/ Mon, 13 Feb 2017 14:58:35 +0000 http://datamart.org/?p=7112

In my recent search on building dummy variables for a loan dataset that I downloaded from Kaggle, I came across this tutorial on data preprocessing and feature exploration, a critical step in building machine learning models. I am still looking for more information on creating dummy variables, but I like the way the presenter, April Chen, explained dummy variables, feature building, and feature reduction. She also elaborates on the pros and cons of automated methods, such as being unable to explain how they work. It is a very insightful presentation and a must for everyone doing machine learning. Watch the tutorial here
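For readers who have not met dummy variables before, here is a minimal sketch of the idea using pandas.get_dummies. It is not taken from the tutorial: the loan records, the column names amount and purpose, and the values are all made up for illustration.

```python
import pandas as pd

# Hypothetical loan records; column names and values are made up.
loans = pd.DataFrame({
    "amount": [5000, 12000, 7500],
    "purpose": ["car", "home", "car"],
})

# One 0/1 column per category; the original text column is replaced.
encoded = pd.get_dummies(loans, columns=["purpose"], dtype=int)

print(encoded)
```

Each category of purpose becomes its own indicator column, which is the form most scikit-learn estimators expect for categorical inputs.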

Loading External Data into Scikit lear AKA SKLEARN http://datamart.org/2017/01/24/loading-external-data-scikit-lear-aka-sklearn/ Wed, 25 Jan 2017 04:13:25 +0000 http://datamart.org/?p=7108

Scikit-learn comes with some toy datasets such as iris. We can load these with
from sklearn import neighbors, datasets
iris = datasets.load_iris()
print(iris)

However, to load an external dataset, for example the wine data I downloaded from UCI, we have to read the file ourselves.

We import pandas alongside scikit-learn and then use the following commands.
import pandas as pd
wine = pd.read_csv(r'C:\Data\wine.csv', index_col=None, encoding='utf-8')
wine.head()
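The loading step can be tested without a file on disk. This is a minimal sketch that assumes a few lines of made-up CSV text in place of the downloaded wine file; the column names Class, Alcohol and Malic_acid and the values are hypothetical. It then splits the frame into the feature matrix and target that scikit-learn estimators take.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for a downloaded file such as wine.csv.
csv_text = """Class,Alcohol,Malic_acid
1,14.23,1.71
1,13.20,1.78
2,12.37,0.94
"""

# read_csv accepts a path or any file-like object.
wine = pd.read_csv(io.StringIO(csv_text))

# Split into the feature matrix and target vector that estimators expect.
X = wine.drop(columns=["Class"])
y = wine["Class"]

print(wine.head())
print(X.shape, y.shape)
```

With a real file, the only change would be passing the file path to read_csv instead of the StringIO object.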

Full Titanic Example with Random Forest http://datamart.org/2017/01/19/full-titanic-example-random-forest/ Fri, 20 Jan 2017 05:54:50 +0000 http://datamart.org/?p=7105

I found this presentation on Random Forest well explained, and I learned about tuning parameters to optimize the model as well as reducing dimensions. It is a fairly simple example, but helpful in directing us to more advanced features. Enjoy.

Advanced Machine Learning with scikit-learn http://datamart.org/2017/01/03/advanced-machine-learning-scikit-learn/ Tue, 03 Jan 2017 22:46:59 +0000 http://datamart.org/?p=7101

An excellent presentation on some advanced features of scikit-learn. This video explores and presents an example of handwritten digit classification using SVM and finds the best parameters using grid search. The video also explores doing machine learning on a computer network. Lastly, there is a great example on text data. I selected this video for its clarity and its easy-to-understand treatment of an advanced topic in machine learning with scikit-learn. This tutorial will offer an in-depth experience of methods and tools for the Machine Learning practitioner through a selection of advanced features of scikit-learn and related projects. This tutorial targets developers already familiar
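To give a feel for the digit classification and grid search mentioned above, here is a minimal sketch using scikit-learn's bundled digits dataset. It is not the code from the video: the parameter grid, the 3-fold setting, and the train/test split are assumptions chosen only to keep the example small.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Bundled 8x8 handwritten digit images, flattened to 64 features each.
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0
)

# Small illustrative grid; the values are assumptions, not tuned choices.
param_grid = {"C": [1, 10], "gamma": [0.001, 0.0001]}

# 3-fold cross-validated search over every C/gamma combination.
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=3)
search.fit(X_train, y_train)

# Accuracy of the best model on data held out from the search.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_)
print(test_accuracy)
```

Scoring on a held-out test set, rather than on the data used for the search, gives a fairer estimate of how the chosen parameters generalize.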

Selecting Null when there is empty string http://datamart.org/2016/12/14/selecting-null-empty-string/ Wed, 14 Dec 2016 10:27:13 +0000 http://datamart.org/?p=7096

Many times we have limited access to the data, views are created by IT departments, and we have to select only the null or empty fields in Crystal Reports. I frequently came across this kind of issue, and I tackled it by using isnull({Field})=True or {Field}="" in the record selection formula editor.

Kaleem Mian
Senior Consultant SAP Crystal Reports

http://datamart.org/2016/12/04/7092/ Mon, 05 Dec 2016 02:27:23 +0000 http://datamart.org/?p=7092

Excellent presentation on how to perform predictive analytics on text data. It was presented at PyCon 2016. The video works through an example of identifying spam using Naive Bayes and regression analysis. Very clear presentation by Kevin Markham. Enjoy.

Although numeric data is easy to work with in Python, most knowledge created by humans is actually raw, unstructured text. By learning how to transform text into data that is usable by machine learning models, you drastically increase the amount of data that your models can learn from. In this tutorial, we’ll build and evaluate predictive models from real-world text using scikit-learn.
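The idea of turning text into data that a model can learn from can be sketched in a few lines. This is an illustration under stated assumptions, not the tutorial's code: the four training messages and their labels are made up, and CountVectorizer with MultinomialNB is one common pairing for this task.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny made-up corpus; 1 = spam, 0 = not spam.
train_texts = [
    "win money now",
    "free prize win",
    "meeting at noon",
    "lunch with team tomorrow",
]
train_labels = [1, 1, 0, 0]

# Turn each message into a vector of word counts.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)

# Naive Bayes works directly on the sparse count matrix.
model = MultinomialNB()
model.fit(X_train, train_labels)

# New text must go through the same fitted vocabulary.
new_texts = ["win a free prize", "team meeting tomorrow"]
predictions = model.predict(vectorizer.transform(new_texts))
print(predictions)
```

Note that the new messages are passed through transform, not fit_transform, so they are encoded with the vocabulary learned from the training text.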
