My Professional Journey

Hi,
This is Angshuman, a statistician by training, working in the broad area of data science for quite a while. Here is my attempt to take you through some parts of my professional experience. If you have been through similar routes, if you happen to have similar tastes, if our paths have crossed some time in the past or if you think our paths may cross some time in the future - maybe it's not a bad idea to say hello to each other.

By the way, this is not my CV, rather my story.

You can reach me at adotsaha@yahoo.com
If you are on LinkedIn, do visit my profile at: in.linkedin.com/in/angshumansaha

Currently.....

I work for Amazon. I am a Machine Learning scientist there.


A while ago.....

I was a senior scientist at GE Global Research from 2003 to 2013. I was based out of Bangalore, India.

During this long, interesting journey, I got the chance to touch diverse domains, handle data from all kinds of sources, and get deep into data mining, modeling and analysis to answer some very challenging business questions.

Let me serve you a sampling platter of problems that have kept an applied statistician happily engaged ...

Retailers are trying to get into their customers' minds all the time. What do they like, what do they love? How to engage them and bring them back to the store again and again? We worked with some Indian retailers with stores all over the country. We developed end-to-end loyalty management systems and deployed analytics-based personalized promotion schemes. Through rigorous case-control studies we demonstrated the monetary benefit to the retailers. The statisticians in the team got to play with loads of detailed transaction data and develop algorithms to auto-generate personalized promotions for each individual customer. Scrolling through rows of transaction data, a typical conversation in our team would go like - "He just bought it last month! Can't believe he is going for it again!" or "Told ya, you'd like the offer - see now you went for it on the day you got it!" or "What-d-ya mean you don't like the promo? This is made just for YOU. What's wrong with you?"

Giant engineering systems are not easy to maintain. People put sensors all over the machines to take a peek inside them - are they working all right? We developed algorithms for real-time early warning systems based on the sensor data. Our system generates an alarm saying - failure is going to happen. Soon! Shut down the machine NOW! You are bang in the middle of a tug-of-war between true positives and false positives. You tweak the algorithm enough - you happily catch all problems - along with a flood of false alarms - operators start cursing you. You tilt the algorithm the other way - very few false alarms - lots of true problems escape through the cracks - Hey! Your warning system is no good, man! Get the picture?
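
For readers who like to see the tug-of-war in code, here is a minimal sketch. It is an illustration only, not the system we built: the anomaly scores are made up, and alarm_counts is a hypothetical helper that raises an alarm whenever a score reaches the threshold.

```python
# Hypothetical anomaly scores (higher = more suspicious); not real sensor data.
healthy_scores = [0.10, 0.20, 0.25, 0.30, 0.40, 0.45, 0.55, 0.60]
failing_scores = [0.35, 0.50, 0.65, 0.70, 0.80, 0.90]


def alarm_counts(threshold, healthy, failing):
    """Return (caught_failures, false_alarms) when alarming on score >= threshold."""
    caught = sum(score >= threshold for score in failing)
    false_alarms = sum(score >= threshold for score in healthy)
    return caught, false_alarms


# A low threshold catches everything but floods the operators;
# a high threshold keeps them calm but lets real problems slip through.
for threshold in (0.3, 0.5, 0.7):
    caught, false_alarms = alarm_counts(threshold, healthy_scores, failing_scores)
    print(f"threshold={threshold}: caught {caught}/{len(failing_scores)} failures, "
          f"{false_alarms} false alarms")
```

With these made-up numbers, the lowest threshold catches all six failures at the cost of five false alarms, while the highest one raises no false alarms but misses half of the failures.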

Seen those beautiful pictures of giant wind turbines rotating in the breeze? Our planet would be so clean and green if we could get all our energy from them. The problem is that wind does not obey our wishes. For a thermal plant, if you need more energy, you burn more fuel. This doesn't work for wind plants. Accurate forecasting of wind energy output thus becomes crucial - so that we can plan ahead - we can ramp up other sources of energy production when we expect a shortfall from wind energy. We worked on wind energy forecast systems. It is an extremely challenging task - the variability of wind, weather and terrain plays havoc with your forecasts. For a good energy forecast you need a good wind forecast, and weather forecasting in general is a tough job. The more distant the future, the more uncertain your forecast. In the very short term, it's easier to forecast. But there is a catch. There is this guy called "persistence", always mocking your sophisticated forecast models. It says, whatever you are observing NOW is your forecast. This ridiculously simple rule does not need any sophisticated machinery to forecast. Yet, for the very short term, it is amazingly difficult to beat with all your state-of-the-art forecast capabilities.
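
The persistence rule is simple enough to write down in a few lines. The sketch below is only an illustration: the hourly output numbers are made up, and persistence_forecast and mean_absolute_error are hypothetical helper names, not part of any system we worked on.

```python
def persistence_forecast(series, horizon):
    """Forecast for time t + horizon is simply the value observed at time t."""
    return series[:-horizon]


def mean_absolute_error(actual, predicted):
    """Average absolute gap between what happened and what was forecast."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical hourly wind-farm output in MW; made up for illustration.
output_mw = [12.0, 13.0, 15.0, 14.0, 10.0, 9.0, 11.0, 16.0, 18.0, 17.0]

for horizon in (1, 3):
    predicted = persistence_forecast(output_mw, horizon)
    actual = output_mw[horizon:]  # the values the forecasts were aiming at
    error = mean_absolute_error(actual, predicted)
    print(f"{horizon}-hour-ahead persistence error: {error:.2f} MW")
```

On this toy series the one-hour-ahead error is much smaller than the three-hour-ahead error, which is the usual pattern: persistence is a strong baseline for the very short term and fades as the horizon grows. Any forecast model would be judged against this baseline at each horizon.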

Have you ever been a part of setting up or maintaining a giant plant? If yes, you will know that thousands of problems and issues crop up at every corner. People have elaborate IT systems to log, track, resolve and manage the issues. Field reps log them as "trouble-tickets". They write text reports and mails and submit them to centralized systems. This is amazingly rich data - an ocean of collective experience and knowledge that can help an organization understand stuff like - what are our most critical problems? Given a problem, what are the similar problems we have seen in the past and how have we solved them, and so on. Such things can be done relatively easily if humans go through each case manually and organize them. However, the sheer volume of such data makes that impossible. Text mining is the key technology behind extracting such knowledge from unstructured text data. I have been a part of such a project. We worked on searching the cases to extract common themes and problems, group similar cases together and so on.
We worked on visual representation of search results, common themes and problem similarities. Text mining pros will tell you (maybe not during their work hours, but in the evening over a drink) that this is a very difficult domain. In free text, people write the same thing in ten different ways, they make spelling mistakes, and they use all kinds of acronyms, shorthand, jokes and sarcasm. Our systems are no match for a human reader in deciphering such things. Still, a lot has been accomplished; even with their limited capabilities, such systems deliver a lot of value.

How long will a part of a machine last? What is its life? It depends on how you operate the machine and in what environment. Probabilistic lifing tries to calculate life as a probability distribution (as opposed to a crisp number). It bakes the information on operating conditions, environment and so on into the life estimate. In one of our projects we looked at probabilistic lifing of particular components of a complex machine. We computed the chance of failure under different probabilistic assumptions on the design of the machine, operating conditions, environment and so on.

Before that.....

For the first three years of my tenure in GE I worked for GE Capital. I joined there in 2001. It was called GECIS then. Later on it grew into a separate organization, outside GE, called Genpact. There I worked with the Commercial Real Estate (CRE) business of GE Capital. This business used to extend loans to develop or acquire commercial real estate (e.g. large office buildings that can house many offices) across the globe. As statisticians we used to compute stuff like the risk of default on a loan portfolio. A very interesting piece of work we did there was to develop a simulation model for computing these risks. Consider an office building with 100 office spaces. We would simulate how tenants in those 100 offices would come and go between, say, year 1 and year 20. Some tenants will stay long, some will leave early. Once a tenant leaves, the office space becomes empty and you need to wait a while to get the next tenant. If the economic conditions are bad, you will have a hard time getting a new tenant. The rent will rise and fall according to the external market and so on. All these lead to a stream of rental income that fluctuates. Our model tried to generate these fluctuating income streams and see if they were enough to pay back the loan. Anybody who has worked on such complex simulation environments would tell you that it is a virtual world of its own. So while running the model (it took quite a while to finish a run), we could "see" tenants coming in, staying for a long time or leaving soon, some office space remaining vacant for a long time - all sorts of things. At times it was a lot of fun, at times it was even "spooky". Imagine watching the windows of a large office building from a distance and seeing lights going on and off, people walking in and out!
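
To give a flavour of such a simulation, here is a heavily simplified sketch. It is not the model we built: every number is hypothetical, the rent is held fixed instead of following the market, and simulate_income and shortfall_probability are names I made up for this illustration.

```python
import random


def simulate_income(n_offices, n_years, rent, p_leave, p_fill, rng):
    """Simulate yearly rental income for one building.

    Each office is either occupied or vacant. Every year an occupied office
    is vacated with probability p_leave, and a vacant one finds a tenant
    with probability p_fill. Only occupied offices pay rent.
    """
    occupied = [True] * n_offices
    incomes = []
    for _ in range(n_years):
        for i in range(n_offices):
            if occupied[i]:
                occupied[i] = rng.random() >= p_leave  # tenant stays
            else:
                occupied[i] = rng.random() < p_fill    # new tenant moves in
        incomes.append(rent * sum(occupied))
    return incomes


def shortfall_probability(n_runs, yearly_payment, seed=0, **building):
    """Fraction of simulated runs in which some year's income misses the payment."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(n_runs):
        incomes = simulate_income(rng=rng, **building)
        if min(incomes) < yearly_payment:
            misses += 1
    return misses / n_runs


# All numbers below are hypothetical.
building = dict(n_offices=100, n_years=20, rent=1.0, p_leave=0.15, p_fill=0.5)
print(shortfall_probability(n_runs=500, yearly_payment=70.0, **building))
```

A bad economy could be mimicked in this sketch by lowering p_fill, which makes vacant offices stay dark for longer and pushes the shortfall probability up.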


Earlier.....

I joined Tata Infotech Ltd. in Mumbai, India in 1999. I was fresh out of the PhD program. It was my first industry experience. I joined ATG (Advanced Technology Group), the R&D wing of the company. We were a team of five or six colleagues with statistics and computer science backgrounds. A very young team. Gelling in with the industry was easier than I thought. I learned new things and saw many myths vanish before my eyes. We were into data mining and data analysis. We looked at customer survey data and picked up a few tools of the trade. One highlight there was working on a statistics software package called XLMiner. It is now a well-known data mining add-on to Excel, developed by Cytel Inc. It was early days and XLMiner was just getting conceptualized and designed. I was working on the GUI design. I was providing my inputs on how a statistics user would interact with the screens, what they would expect as defaults, how they might like to see the results organized and so on. This was the first time I was getting involved in GUI design and thinking about usability issues, features and bugs of a piece of software.

While in school.....

University of Washington, Seattle (1994-98):

During 1994-98 I was in the PhD program in the Statistics department at the University of Washington in Seattle. It was my first time outside the country, completely on my own. It was an interesting experience. It took some time getting used to the grad student lifestyle. Tough courses, weekly assignments. The fear of failing the qualifying exams at the end of the year always loomed large over us. Teaching experience as a TA was something new. Not-so-pleasant feedback from students in my TA class hit hard at the lofty image I had of my own teaching capabilities. Meeting big names in the statistics world and taking their classes was a nice experience.

One experience that had a long-lasting impact on my thinking and professional career was that of meeting my advisor, Prof. Finbarr O'Sullivan. His deep knowledge, pleasing personality, very pragmatic, down-to-earth approach towards statistics, and his care and concern for his students - all impressed me enormously. His influence helped me transform from a grad student into a professional applied statistician.

My thesis work was related to statistical issues in PET (Positron Emission Tomography) imaging. People use what are known as "compartmental models" to describe how a PET tracer spreads through different parts of tissue. These models have parameters that need to be estimated from PET image data. We worked on these statistical estimation methods. The values of these parameters have clinical significance and can be used to understand the state of a disease.

In 1997 Prof. O'Sullivan went on a one-year sabbatical to his native place, Cork, Ireland. He took us, three of his students, with him. For one year I continued working with him while also working as a visiting lecturer in the Statistics department at University College Cork. That was an experience of life in academia. I was designing courses on my own and teaching both undergrad and grad classes. I also had an opportunity to conduct statistics training sessions for industry.


I graduated from the PhD program in 1998 and headed back home to India.

Indian Statistical Institute (1990-94):

This all of us have heard so many times. And I cannot agree more. College days were some of my most memorable days. I joined the Indian Statistical Institute (aka ISI) in 1990 in the B.Stat (Bachelor of Statistics) program. I came out of ISI in 1994, after completing my M.Stat (Master of Statistics) degree. From high school to ISI was a big leap. It was a different world altogether. In those days in 1990, ISI was much less known to the world outside academia. A lot of people used to confuse this ISI with another ISI - the Indian Standards Institute. That was because the "other" ISI was much more visible to people through the "ISI mark" (a seal of quality assurance from the Govt.) on everyday household stuff like "GI Surya pipe" or "Lakshmi KaDai". So there was very little glam-value associated with ISI (as opposed to, say, IIT - Indian Institute of Technology, a dream destination of would-be engineers). Somehow I took the entrance test, got selected and my life at ISI started.
   
Things were interestingly and pleasantly different from the very beginning. Take the entrance test, for example. We were tested only on math skills. It was mostly a multiple-choice test. Relatively few questions, plenty of time to write the exam. This was in sharp contrast with other popular entrance exams (like IIT-JEE), where you had to write a lot, you were always under time pressure, and you were not expected to answer all the questions in the given time. The ISI test, at first glance, seemed like a breeze - but the keyword here is "seemed". As we got down to answering question 1, unexpected twists, surprises and confusions were staring at us with broad grins. There was another shocker for the guys who don't mind a little help from the friend sitting next to them. Questions as well as answer options were ordered differently in different exam scripts. Question 1 for me is actually question 14 for my friend. By the time you realize that, the next shocker - for me (B) is the correct choice for Q1 and for my friend (D) is the correct choice for Q14. Pretty interesting!

Classes started. We were in a brave new world of abstract math. That kind of math we had never seen before. Great teachers, experts in their subjects, absolutely awe-inspiring. Though we went to learn statistics, we found it was one of the best places to learn mathematics on the way. Five years passed in the blink of an eye. The peer group was fabulous. We learned as much from our colleagues as from our teachers. In every batch we saw geniuses who were way ahead of the rest in their knowledge of statistics, mathematics and computer science. I stayed in the hostel on the campus. The hostel life was amazing. I am sure all you guys who have stayed in college hostels have fond memories. One pleasantly strange thing about our hostel life was that a lot of our discussions, jokes and poems were very mathy and technical in nature. I think a lot of ISI alumni out there share this same experience. That is perhaps both good and bad. If you "belong" there, you belong there so much that the rest of the world seems relatively dull.
Here is a test for you. I'll tell you a "joke" I heard in my early days. Does it blow you away or do you feel it is a strange nerdy piece with no joke-value at all?

In our early days, while we were learning the principle of proof by mathematical induction, somebody demonstrated the principle through the following example. On my way to the hostel I saw a dead dog on the road. It was run over by a truck. I will prove - "All dogs die this death" and I'll prove it by mathematical induction. Clearly this is true for n=1, because here is the dead dog. Suppose the assertion is true for n=k. If I can prove it for n=(k+1), I am done, by the principle of mathematical induction. So let's assume it is true for n=k. The one I see on the road is the (k+1)th dog. Hence I have proved - "All dogs die this death." - What d'ya think?
   
We learned various branches of statistics and mathematics at ISI. We learned a bit of computing. It was mainframe days there. We were blissfully unaware of PC, Windows, Office, Internet, Web. Some streams grabbed our full interest. For some other streams like Abstract Algebra or Functional Analysis, it took only one or two classes to realize that we were not cut out for this stuff. We picked up a lot of the theory of statistics and relatively little applied statistics. I wish I could have got more of the latter from the great teachers; however, that was the reality of those days. The PhD program at the University of Washington gave me the tools of an applied statistician and that eventually led to where I am today.
   
School Days (1982-89):

I went to a Ramakrishna Mission school at Narendrapur, West Bengal for my schooling. It was a residential school and college. I have fond memories of that beautiful lush green campus. I spent a lot of my childhood days there, away from home. I feel those days taught me a lot of very fundamental things that have helped me in the long run. Some of these are stuff like - managing your day-to-day life on your own, cleaning your room, making your bed, some discipline in daily routine and so on. However, personally, I felt that we imbibed all this discipline in a rather subconscious way. Our daily lives were filled with joy and fun. The fun of playing and chatting with friends, the fun of bending rules here and there, the thrill of an "improved" menu during Sunday lunch. Another amazing thing the school did for us was that it organized events all round the year - events like sports meets, football and cricket matches, quizzes, debates, recitation, extempore speeches, music competitions, drama competitions, science exhibitions - the list is really long. Those helped us pick up skills like public speaking, expressing yourself through various art forms and so on. I'd really love to hear from the ex-students of the school what you think you have picked up from your days there.

So you took the journey.....


If you have reached here after reading through this long, long piece - congratulations! Frankly, I had doubts that you would reach here.
And thanks too.

Leave your comments, let me know your thoughts. Maybe you will also have interesting stories to share.


Angshuman Saha,
Jan 17, 2012, 4:23 AM