Bibin Jose - CV


Bibin M Jose



Data Scientist

Bangalore, India

bibinmjose@gmail.com


Detail-oriented Data Scientist with the ability to draw critical insights from data. Highly experienced in the design, implementation, and packaging of deep learning algorithms.


Programming

Python

100%

Bash Scripting

80%

R

50%

MATLAB

70%

Platforms and Frameworks

TensorFlow, PyTorch, PyMC3, scikit-learn, AWS (SageMaker, S3, EC2)


Databases

SQL, MongoDB


Other Tools

Markdown, IPython Notebook, Igor Pro, HTML, Photoshop, VSTS, ImageJ, Git, OpenCV, skimage, dimple.js


Strengths

  • Package Management

  • Code Readability

  • Tooling and Reusability (Modularity)




Work Experience (Data Science only)

Data Scientist
Affine Analytics, Bangalore, India
Jan 2018 - Current

Worked on various analytics projects, mainly focused on computer vision (image segmentation), forecasting (DeepAR), and uncertainty estimation, all using deep neural networks. Project details are omitted here because the work is confidential.

Image Segmentation, CNN, LSTM, DeepAR, Probabilistic Programming, Bayesian Statistics, Time Series, Python Packaging

Data Analyst
Technoconsulting Corp, Dunellen, NJ, USA
Mar 2016 - Feb 2017

Worked with the Business Analyst to develop visual analytics for the data.


Research Assistant
Microfluidics and Interfacial Fluid Dynamics Laboratory, Stony Brook University, NY, USA
Aug 2009 - May 2015

Published 7 papers in top-tier journals, developing the ability to tell compelling stories through creative data visualization techniques.

Investigated highly viscous immiscible two-phase flows in microfluidic channels.


Education

Stony Brook University, New York, USA
M.S. and Ph.D. in Mechanical Engineering
Aug 2009 - May 2015

College of Engineering, Kerala University, Trivandrum, India
B.Tech in Mechanical Engineering
Aug 2003 - May 2007


Certifications

    • Management Strategy Institute – Six-Sigma Black Belt Professional
    • Datacamp – Statistics with R, Data Visualization and Manipulation in R

Awards and Honors

    • Best Poster Award – Annual Student Research Symposium (2012)
    • François Frenkiel Award – American Physical Society – DFD (2012)
    • National Science Foundation – Research Mentoring Award (2012, 2011)


Publications



Data Science Projects

Finding Donors for Charity

Modelled individuals' income using data collected from the 1994 U.S. Census. The goal is to build a model that accurately predicts whether an individual earns more than $50,000. Such predictions are useful in a non-profit setting, where organizations survive on donations.

Supervised Learning, Naive Bayes, K-Nearest Neighbours, Logistic Regression, Random Search, Feature Selection

Customer Segmentation

Segmented customers based on channel and region.

Clustering, PCA, Soft Clustering, scikit-learn, Encoding, Unsupervised Learning, Biplot

Student performance in Indian Schools

Identifying key factors leading to 8th grade student performance in Indian schools.

Data Analysis, Python, pandas, HTML, Markdown, Visual Encodings, matplotlib

Visualizing Data: Baseball Player Statistics

Used dimple.js (a JavaScript charting library built on D3) to create visualizations based on the statistics of 1,157 baseball players.

Data Visualization, D3, dimple.js, HTML, Visualization Design Principles, Visual Encodings, Animation, JavaScript

Identify Fraud from Enron Email

Built a predictive model with the scikit-learn package in Python to identify persons of interest (POI) from emails exchanged between Enron employees.

Machine Learning, Python, scikit-learn, PCA, Naive Bayes, Logistic Regression, Decision Tree, Feature Selection, Natural Language Processing

Exploratory Data Analysis of Red Wine

Used univariate, bivariate, and multivariate plots and summary statistics to explore the data and the relationships between variables. Data visualizations are used to compare and identify trends.

RStudio, R packages, ggvis, tidyverse, Correlation Matrix, RMarkdown, Reporting

Data Wrangling of OpenStreetMap

Audited OpenStreetMap data for validity, accuracy, completeness, consistency and uniformity.

Cleaned data from many large files, then stored, queried, and aggregated it using MongoDB.

MongoDB, Python, Data Parsing, Data Verification, Data Cleaning, Data Audit, Database Query, IPython Notebook

Test a Perceptual Phenomenon

Analyzed the data set from a Stroop task, using hypothesis testing to compare the effects of congruent and incongruent words.

NumPy, Python, pandas, Statistics, Histogram, Probability Distribution, t-test, Hypothesis Testing, IPython Notebook

Analysis of Titanic Data

Exploratory Data Analysis of Titanic Data

NumPy, Python, pandas, IPython Notebook

Bay Area Bike Share Analysis

Go Bike is a company that provides on-demand bike rentals for customers in San Francisco. Created visualizations and performed an exploratory data analysis of the company's open dataset.

NumPy, Python, pandas, IPython Notebook