Notes from Industry

About a year ago I suggested two peer review processes for data science projects, outlined a structure for the process — including separate review of the research phase and the model design and implementation — and positioned it within the wider scope of a data science project flow (as it…


The Problem

Ever had a Python project that uses a tool or package that is configured by environment variables to authenticate with some service? …


Making your work more error-proof using peer scrutiny

Peer review is an important part of any creative activity. It is used in research — both inside and outside academia — to ensure the correctness of results, adherence to the scientific method and quality of output. In engineering it is used to provide outside scrutiny and to catch costly…


A concise review of the major approaches.

The question of what event caused another, or what brought about a certain change in a phenomenon, is a common one. Examples include whether a drug caused an improvement in some medical condition (versus the placebo effect, additional hospital visits, etc.), …


A review of notable literature on the topic

Word embedding — the mapping of words into numerical vector spaces — has proved to be an incredibly important method for natural language processing (NLP) tasks in recent years, enabling various machine learning models that rely on vector representation as input to enjoy richer representations of text input. …


A practical guide to packaging Python code

Say you have a nice piece of Python code: a couple of small related functions, or perhaps even a medium-sized module with a few hundred lines of code. …


Stationarity is an important concept in time series analysis. For a concise (but thorough) introduction to the topic, and the reasons that make it important, take a look at my previous blog post on the topic. Without reiterating too much, suffice it to say that:

  1. Stationarity means that the…


Testing open-source Python on several operating systems

Say you have an open-source Python project or package you are maintaining. You probably want to test it on the major Python versions that are currently in wide use. You definitely should. In some cases you might also need to test it on different operating systems. …


A review of the concept and types of stationarity

This post is meant to provide a concise but comprehensive overview of the concept of stationarity and of the different types of stationarity defined in academic literature dealing with time series analysis.

Future posts will aim to provide similarly concise overviews of detection of non-stationarity in time series data and…


A data scientist’s take on our process

I was recently asked by a startup I’m consulting for (BigPanda) to give my opinion about the structure and flow of data science projects, which made me think about what makes them unique. Both managers and the different teams in a startup might find the differences between a data science project…

Shay Palachy

Data Science consultant. www.shaypalachy.com
