About Me

My name is Carter Sibley, and I am a data scientist at Kaggle. If you want a shorter, ‘just the facts’ description of this site and the (developing) philosophy of “Overkill Analytics”, please go here. This page is just an excuse to tell a long, boring story about myself.

How I Became a Data Scientist

Based on my childhood, it would not be surprising to anyone that I went into a data science career.   I was the youngest kid at the Commodore 64 user group meetings (after graduating from the Timex Sinclair).   I went to MathCounts.   I spent summers building neural networks in a physics lab.   I was, in short, a certified math and computer geek.

I then made the terrible, terrible mistake of reading a novel by an obscure author – John Grisham – called The Firm.   In case you haven’t read it, The Firm is the story of a bright young man named Mitch McDeere who graduates from Harvard Law School, gets a law firm job making a boatload of money, works 16 hours a day, becomes miserable, finds out he actually works for the mafia, tries to leave, and gets hunted by trained assassins.   In the end, Mitch escapes with a hard-won lesson that ‘easy money’ is never truly easy (and, paradoxically, a stolen Swiss bank account of laundered mob money).

For most, the novel is a harrowing morality tale about using one’s talents to pursue profit over purpose.   As a teenager, my thought was, “Hey, lawyers make a boatload of money as soon as they graduate!”

So just like Mitch McDeere, I went to Harvard Law School, got a high-paying law firm job, worked 16 hours a day, and became miserable.   I never worked for the Five Families, and I certainly didn’t look like Tom Cruise, but I definitely needed to escape.   I began using my rare off-hours to pursue frustrated technical impulses, secretly dabbling in genetic algorithms and neural network ensembles.  I tried to justify the time by claiming it was for sports betting or improving my poker game, but really I just wanted to play with data.

In the end, I had to stop living the lie.   I sat my wife down and told her that I was not the cocky lawyer I pretended to be, but a code-writing, graph-plotting, algorithm-designing, equation-loving, statistical-journal-reading data geek.  (I think she had known all along.)   I bit the bullet and left law for the glitz and glamour of insurance pricing models.

It wasn’t a beach in the Caribbean with a fortune in laundered mob money, but I had escaped – just like Mitch.

My Data Science Career

For the ten years since, I have thoroughly enjoyed my work.   As a statistical research director in the insurance industry, I was able to attack some fascinating analytical problems. I later had the opportunity to work as a data science consultant for Oracle’s Big Data team, where I learned about the current state and the potential of predictive modeling across a range of industries. Throughout, I’ve found that I enjoy nearly every aspect of my new career: munging the data, building learning algorithms, developing a workflow, improving analytical platforms, and particularly finding and nurturing great analytical talent. I even enjoy reducing the fruits of my labor into terse but colorful slides for senior management.

That being said, I have observed some pain points in the practice of analytics at traditional enterprise companies. The rapid explosion of data now available requires analytics that are robust, iterable, and scalable: i.e., a full-scale development process rather than a few scripts and a PowerPoint deck. In most enterprises, however, analytics are (rightfully) owned by domain experts rather than development teams – making it difficult organizationally to implement this type of process. Moreover, the tools being built to bridge this gap – tools to allow business users to ‘easily’ implement complex machine learning – are often much too generalized to provide utility. Data science problems are highly domain-specific – and producing answers will require platforms and tools developed specifically for each domain.

There are two options for bridging this gap. One alternative is a massive infusion of ‘data scientists’ (i.e., hybrid statistician / developers) across a host of enterprises and industries. The shortage of talent makes this difficult, however – especially if companies want to produce superior machine learning that provides real marginal advantages. The other alternative, therefore, is the one I believe will succeed – centralizing data science talent to attack critical machine learning problems industry by industry.

That’s why I now work at Kaggle. I believe Kaggle is uniquely positioned to build domain-specific data science products that actually meet enterprise needs. The key is Kaggle’s experience in designing and hosting data science competitions to address problems across a huge range of domains. Kaggle has identified how to use a generalized process for high-performing analytic solutions while leveraging highly talented pools of data scientists for the time-intensive, problem-specific work of creating the solutions’ design. Kaggle is now leveraging both the best practices learned from the Kaggle competition process and the best talent identified in those competitions to build more comprehensive machine learning solutions, initially in the energy industry. To me, this seems like the best recipe to bring scalable data science to the enterprise, and I’m extremely excited to be a part of the Kaggle team.

Thanks For Reading

There it is, as promised – a long, boring story about me. If you indulged me and read this self-centered narrative, I encourage you to indulge me further and read the actual content of the blog.  Tell me what makes sense to you, and more importantly let me know what I don’t know.

Thanks!

 

7 Comments

  1. Aditya Nag says:

    Hey,

    Inspiring story. I went to law school, but never practiced! Went straight into technology, and never regretted it. Good luck with all that you’re doing.

    Regards,
    Nag

  2. Carter says:

    Thanks. Had fun writing this, glad someone read it.

    In law, and to a lesser extent medicine, I’ve met a number of people misplaced and unhappy in their careers. For every happy lawyer or doctor (like my wife, who is perfectly suited for medicine), there’s another who only went into the field because it was respected, well-paid, encouraged by parents, etc.

    No one ever says, why a lawyer? Why a doctor? (They do say, why a data scientist? Or actually, what the hell is a data scientist?) These professions are easy to choose, if hard to attain, and it definitely leads to misused talent.

    Glad you found your Caribbean beach, so to speak!

  3. […] general approach – consistent with my overkill analytics philosophy – was to abandon any notions of elegance and instead blindly throw multiple tactics […]

  4. Punit says:

    Hi Carter,

    I must admit your story is really motivating and inspiring.
    I started my career in one of the “BIG 4 consulting” firms as a business intelligence developer, purely focussed on getting data correct, data formatting, and dashboard design.
    But after doing 6yrs of the same job, my interest levels had dropped to the lowest ebb.
    It was then that I made a move to my new company as a BI developer in the marketing analytics team and was exposed to predictive analytics to a small extent. I’m trying to make an entry into this field using R.
    Since you have been through the same phase, what would be your suggestion for people like me who are going through a mid-life crisis?

    Cheers,
    Punit

    • Carter says:

      Hi Punit:

      Sorry for the late reply, just returning from vacation.

      Thanks for the comments – I’m always surprised what a chord this story strikes with folks. My best advice is to look for firms, industries, and cultures that let you learn through doing and, more importantly, let you establish credentials through work rather than school. It’s hard to prove to that first employer that you can do work you’re not directly trained to do, but once you are in a role and producing results the doubts fall away quickly.

      As I’ve learned late, the type of ‘web presence’ I’m trying to create now – Twitter, this blog, etc. – can be essential. Writing a lot of material that (hopefully) makes sense to people can establish ‘street cred’ more quickly than anything else.

      For predictive modeling specifically, Kaggle is a great place to objectively establish that you know what you are doing in the field.

      Hope that helps. Thanks for reading and the kind words!

  5. Eric says:

    That’s great! I went to school for economics, but when I took econometrics, I realized I loved all the programming, stats, and math. I took a consulting gig, where I analyze marketing data, but I really don’t know very much. I’m learning “how to be a data scientist” right now.

  6. Cliff says:

    Hi Carter,

    Really enjoyed your story.
    Just like you, I’m a data scientist in the insurance business, so you know that I will read through the contents and hopefully throw in my two cents as well!!

    Regards,

    Cliff
