Which Universities Have the Best Coders in the World?


With early college admissions under way for many universities around the country, we got to thinking: Which colleges have the best coders in the world?

While there are academic rankings, like the Top Computer Science Programs list by US News & World Report, there is no list that ranks colleges purely by their students’ ability to code. The criteria for the US News & World Report ranking, for instance, include the number of research papers produced, global research reputation and number of conferences. Practical coding skills aren’t part of its methodology at all.

We decided to answer the question: Which universities have students who can roll up their sleeves and code?

At HackerRank, millions of developers, including hundreds of thousands of students, from around the world regularly solve coding challenges to improve their coding skills. In order to figure out which colleges have the best coders, we hosted a major University Rankings Competition. Over 5,500 students from 126 schools from around the world participated in the event. Companies also assess developers’ coding skills using HackerRank to hire great developers. 

According to our data, the top three best coders in the world hail from:

  1. Russian Federation College, ITMO University | Russia
  2. Sun Yat-sen Memorial Middle School | China
  3. Ho Chi Minh City University of Science | Vietnam

The University of California, Berkeley was the #1 college in America, and came in fourth overall.

First, we defined what it means to be the “best” university. We thought it would be fairest to rank universities based on both number of participants and high scores. Our engineering team created a formula* to rank each university. Each university had to have at least 10 participants to place on the leaderboard.

We narrowed the data to the top 50 colleges around the world:

[Image: leaderboard of the top 50 universities]

Two Russian universities ranked #1 and #6, respectively, in the HackerRank University Competition. Meanwhile, Russian universities aren’t listed among the top 50 universities in the traditional US News & World Report list. Similarly, we found that Vietnam’s Ho Chi Minh City University of Science has talented coders, but it didn’t rank high in the US News & World Report list either.

This is not to say that the US News & World Report is misguided. Instead, the results of the HackerRank University Competition suggest that such traditional academic rankings aren’t the only source of the best coders in the world.

In fact, one acclaimed high school in China blew many universities out of the water. Sun Yat-sen Memorial Middle School (which equates to the high school level of education in the US) placed 2nd, above UC Berkeley and IIT. One Chinese blog mentions that the school is actually bigger than most universities in China, and includes a science museum.

Wentao Weng, who ranked #13 overall, says he first started learning how to code in what he calls “Junior 1,” at 11 years old. Wentao told us that computer science isn’t necessarily a standalone subject in grade school, but it’s well supported:

“It’s not one of the subjects; however, we can also try to become the one of the best coders among high school students to [get admission] into a good university,” Weng says. “So our teacher supports us in [studying] computer science, and we take some time on it. And we have done many contests both online or offline [to] learn.”

He practices roughly 4 hours per day during school, but almost the whole day on weekends. His classmates have a similar work ethic. Cai Ziyi started coding at 12 years old. He says that most student programmers join the Olympiad in Informatics (OI) as an after school hobby.

[Image: leaderboard of the top 25 US universities]

Zeroing in on the top 25 universities in the US, eight schools cracked the top 50 overall. Many of the schools listed in our competition are in line with the US News & World Report, except we surfaced a few underdogs. Schools that aren’t normally seen in academic rankings, like Ohio State, UC Irvine and North American University, all ranked in the top 50 worldwide in the HackerRank University Competition.

While traditional academic rankings, like the US News & World Report, are one indicator of quality of education, they’re not the only place to find great coders. Great coders can come from any university in the world. In fact, as the students at Sun Yat-sen prove, you don’t even need a degree to be able to code well.

*** Scoring:

* To calculate the score of a school on the leaderboard, we take all participants from a particular school (M) in descending order of the students’ scores and calculate using the formula below. Note: The values for α and β for this leaderboard are 0.8 and 3 respectively.

[Image: scoring formula]

 

 

 

 

 

In order for a school to be listed on the School Leaderboard, the school must have at least 10 students submitting code in the University Competition. Students are ranked by score. If two students have the same score, the tie is broken by the time at which the user finishes the first correct submission of the last challenge solved.
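The student ranking and eligibility rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Student` record, its field names and the `last_solve_time` representation (seconds from the start of the contest) are hypothetical, and every record is assumed to belong to a student who submitted code. It does not attempt to compute the school score itself.

```python
from dataclasses import dataclass

MIN_PARTICIPANTS = 10  # eligibility threshold stated in the post


@dataclass
class Student:
    name: str
    school: str
    score: float
    # Hypothetical field: time (seconds from contest start) of the first
    # correct submission of the last challenge this student solved.
    last_solve_time: float


def rank_students(students):
    """Order students by score (highest first); ties go to the earlier time."""
    return sorted(students, key=lambda s: (-s.score, s.last_solve_time))


def eligible_schools(students):
    """Return the schools that have at least MIN_PARTICIPANTS students."""
    counts = {}
    for s in students:
        counts[s.school] = counts.get(s.school, 0) + 1
    return {school for school, n in counts.items() if n >= MIN_PARTICIPANTS}
```

Sorting on the tuple `(-score, last_solve_time)` keeps both rules in one place: the score decides first, and the time only matters when scores are equal.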

 

Girls Who Code: 3 Teen Sisters Crush Coding Records, Stereotypes

Chances are, 11-year-old programmer Mari Machaidze is growing up in a pretty different world than you and me. It might even be a better world, ingrained with the idea that programming is just another skill to be mastered through persistence with or without a Y chromosome.

If you were to tell Mari that girls don’t look like engineers, women can’t code as well as men, or women aren’t as competitive as men, she’d raise a skeptical eyebrow.

Mari can instinctively point to not one, but two sisters who would prove you wrong. Eighteen-year-old Elene Machaidze and her sixteen-year-old sister, Ani, routinely participate in coding competitions where they outperform thousands of men. Their home is decorated with medals from prestigious programming tournaments, like the International Olympiads in Informatics (IOI).


The three sisters live in Georgia, a European country with a population equal to just 11% of that of the state of California. They’ve gained quite a bit of local fame for coding circles around their opponents. They were born into a household that’s worked hard to build a more equal world, with supportive parents, teachers and mentors who instilled confidence in them at an early age.

Thanks to her eldest sister, Elene, who first lit up the path to programming, Mari and Ani have a strong, successful role model and mentor to guide them through the male-dominated field.

We sat down with Elene to learn more about their story and how they achieved so much at such a young age.

So how long have you all been coding?

Mari just started learning programming last year. Ani has been coding for four or five years now, and I started coding when I was in sixth grade. I joined a programming club called Mzuiri. I just graduated from Komarovi school, which focuses on math, physics and computer science. Ani is going there now, and Mari will go there next year.

What drew you all to coding?

Our parents actually went to Komarovi school too. My dad is a programmer, and he works at a bank as a security analyst. We were exposed to math and computer science at a very early age, and we all love coding and participating in contests just for fun. I do want to major in computer science, and eventually work as a programmer like my dad.  


How many programming contests have you competed in? And how many medals have you won?

I’ve participated in tons of contests and olympiads. But the most significant ones were:

  • IOI
  • CEOI
  • IZhO
  • GeOI
  • Google Code Jam
  • HackerRank Women’s Cup
  • Facebook Hacker Cup
  • USACO
  • COCI

There were more too. I’ve won 2 bronze medals at IOI, 1 bronze at CEOI, 2 silvers at IZhO. Mari, Ani and I competed in HackerRank Women’s Cup as a team last year, and we ranked third place! Some companies that sponsored the event even sent us a letter after the contest, but I had to tell them that we’re too young right now to work for them.


I might call them when I’m a student or graduated. I’m applying to colleges. I took a gap year after high school, and I was actually teaching programming to 7th to 9th graders. I often point my students to HackerRank challenges to learn how to code. It’s a great tool to supplement learning in a very hands-on way. I love how the problems are arranged on the platform. I’ve been using it for years, back when it was first called Interview Street.

Wow, that’s incredible. You’re getting job opportunities before college! And even 11-year-old Mari joined the contest?

Yeah, Women’s Cup was one of her first contests.

We all worked together as a team. I did most of the coding, but Mari and Ani helped me think through the problems.

It was a lot of fun, and we were really surprised we won 3rd place. It was an awesome feeling.

How many programming languages do you know? What is your specialty?

It’s funny, I actually started coding in Pascal in 6th grade. It’s such a useless language today, but that’s how I started. Then, I learned C++ and I’ve been coding in C++ ever since. More recently, I’ve been learning Python as well.

Do you ever feel like you’re treated differently in forums, discussions or by men in general? Do you feel like you have to prove yourself more so?

Some boys definitely think that they’re better than me just because I’m a girl. I might have felt bad about that years ago, but I don’t feel that way today. I’ve participated in many olympiads and competitions.

And even though there are many more boys than girls, I was one of the first few girls on the Georgian team in IOI and I was the second Georgian girl to win a medal.

The boys don’t say anything anymore. Generally, women are strong and I think more women should code.

Yes, we agree. And how do your sisters feel being one of the few female programmers? What advice do you give other girls who want to be great at solving coding challenges like you?

For coding challenges, like the upcoming Women’s CodeSprint, remember that if you get stuck, try to think outside of the box. I like to remember the 9 dots puzzle because it’s a great example of thinking differently.

For those of you who aren’t familiar, the 9 dot puzzle requires you to connect 9 dots by drawing four straight, continuous lines that pass through each of the 9 dots without lifting your pen. Most people think to connect the boundaries, which makes the puzzle seemingly impossible. The only way you can solve this is by drawing the lines outside of the square. Hence, thinking outside of the box.

Anyone can code well if they work hard and are willing to open their minds to solving problems differently.

As for my sisters, if a guy says girls can’t code as well as guys, then my sisters just say “well, my sister wins competitions.” Anytime anyone says you can’t code, it’s all the more reason to roll up your sleeves and work hard. Remember, if you work hard, you can achieve anything and prove them all wrong.

Want to practice and show off your coding skills like Elene?

Join thousands of women worldwide to participate in Women’s CodeSprint April 22nd, 2016.

For Anyone Who Has Been Turned Down by 38 Companies, 120 Interviews

Alibek Datbayev’s journey to helping build the future of travel at Booking.com


Nearly 38 rejections in the span of 2-3 months sounds brutal to the average person. But for great software engineers, such resilience is a common trait. All too often, great software engineers pass traditional resume screenings only to freeze during the whiteboard coding interview.

If you think about it, coding on the spot in front of 3-5 different people multiple times isn’t a great reflection of your coding skills. You don’t get to use your own IDE, you have an absurdly limited amount of time and you’re in an incredibly high-pressure environment.

Job interviews are inherently difficult. But for Alibek Datbayev, landing a new job proved to be a test of a whole new level of willpower. Not only did he interview at about 40 companies, each of those companies had 2-4 rounds of interviews.

In about two months, he calculated a total of 120 rounds of interviews, resulting in 10 final rounds and 2 offers.

Here’s the thing: Datbayev is an exceptional coder. He’s not only built geo apps, online ticket booking systems and an online ecommerce store from scratch, but also worked on cutting-edge new tools like back-end reward points systems and developed the largest blogging platform in Central Asia. But, like many coders, crushing job interviews just wasn’t one of his strong suits. Job interviews are difficult by design.

We sat down with Datbayev to learn more about his journey navigating through over 100 job interviews, and finally achieving a highly coveted opportunity of helping to build the future of travel at Booking.com.

So, how did you get to where you are today?

I’ve always been passionate about coding, starting from my early days at Olympiad teams in high school and ACM teams in college. This involvement, and consistent practicing, has really helped me master my technical skills.

I’m originally from Kazakhstan, where the tech scene is burgeoning, but of course it doesn’t compare to Silicon Valley. I’ve always dreamed of going abroad to other tech hubs and building cool technology. I had the opportunity to do that in 2014 when I was referred to Ipsy, the beauty product retailer, which helped me get a job in San Mateo. But after my visa expired, I had to return to Kazakhstan for a couple months to find another job. That’s when I interviewed with about 40 companies.

Wow, you interviewed at 40 different companies. What was going through your head as you went through so many job interviews?

I mean, of course it’s tough. There were several reasons why the job opportunities weren’t working out. But I know that I’m confident in my skills. It was just a matter of time. Many of the companies I was interviewing at were just not the right fit. For other opportunities, I simply didn’t do well enough in the difficult coding challenges. Many other companies didn’t want to hire me unless they met me in person. This was difficult because I was in Kazakhstan, and there was a 9-12 hour time difference. I would often do coding interviews at like 1 AM or 2 AM.

For instance, I got into the final interview for an extremely high-growth human resources startup. That was exciting, and I really thought I was going to get an offer. But, eventually, the last round of the interview was super hard. I just failed.

But I just kept going because I knew it would happen eventually. I have the right skills, but it’s hard for companies to see that easily in the way most coding interviews are set up.


 

And how did you succeed and land a job at Booking.com’s engineering team?

Booking.com was hosting an online coding competition through HackerRank in September 2015, and I entered the contest. This changed my life.

I wouldn’t be here in the beautiful city of Amsterdam, where Booking.com HQ is based, if it wasn’t for this CodeSprint, or online hackathon.

I actually didn’t particularly score very high on those challenges (editor’s note: his rank was 305/435), but since I opted into the job opportunity after successfully passing the phone screen technical interview, the recruiters and engineers invited me for an onsite interview and they liked the way that I approached the problems.

This interview process was great because I was able to get my foot in the door in just a day, on my own computer, from my own home.

Booking’s culture is all about opening doors to the best talent internationally. So, after a couple more interviews that focused on culture fit, they decided to relocate me, which was incredibly helpful. So, I just started working on the engineering team at the headquarters about two months ago. I’ve been loving it so far. I’m really happy I participated in Booking’s online hackathon, and I’m grateful for that opportunity!

Any advice for other people who are struggling to succeed at coding interviews?

Even though algorithmic challenges aren’t really used on the job much in production, it’s still really important to keep revisiting your fundamentals. It’s just like a muscle–if you don’t train it, it’ll become weak. Keep practicing code challenges, and don’t give up. If you fail 10 interviews in a row, go for the 11th interview. But take a look at all the variables, and see if there’s anything you can do differently to improve. Take the pressure off, and work through problems routinely to keep your muscle memory in shape.

At some point, I mastered my skills, and practicing code challenges helped me fill in spaces in my knowledge.

Want to practice your coding skills? Join 30 Days of Code

Why Should Senior Engineers Balance Trees in an Interview?

For nearly as long as companies have hired programmers, managers have asked engineering candidates to solve fundamental algorithm and data structure problems. And for nearly just as long, engineers have debated the validity of these challenges in job interviews (2005, 2015).

The argument is: If I’m never going to balance a tree on the job, why would you ask these fundamental coding questions to gauge my skillset? At first pass, this can be infuriating for senior engineers. Who’s going to remember basic tree-traversal from computer science (CS) courses when you’ve been using easier, faster standard libraries for years?

But what’s not emphasized as often is the value of basic CS fundamentals for most roles. Everyone knows the best strategy for screening candidates is to test for whatever’s important for the job, but simple algorithm questions actually play an important role in uncovering what engineers can and can’t do. If you dig deeper, engineers who can’t complete basic algorithmic code challenges in an interview are actually less productive hires in the long run.

You Get Unqualified Hustlers with Quick Wins

If you don’t test for CS fundamentals, you’ll risk hiring programmers who are only good at getting things done in the short term. They can put together decent code using APIs and build a glowing portfolio. But if you ask them why their program works the way it does, they’d see opaque black boxes. It’s like they’re assembling parts together without a toolkit.

Over the past several years, there’s been a sharp boost in the number of APIs and standard libraries. For instance, Salesforce alone has over 3 million applications in its third-party app system. Look at the sharp rise in APIs in the last 10 years, according to ProgrammableWeb:

[Image: ProgrammableWeb chart of the growth in APIs]

The uptick in these neat packages makes it easy for programmers to get by without revisiting the fundamentals. And that’s fine if you just want to hustle, get a quick win and build a stunted product.

But most–if not all–accomplished programmers, from Donald Knuth to Ken Thompson, value the importance of knowing why code works in building revolutionary products. For instance, Knuth’s 1968 masterpiece, The Art of Computer Programming, was the first time coders could understand why algorithms work the way they do. As Knuth put it: “So my book sort of opened people’s eyes: ‘Oh my gosh, I can understand this and adapt it so I can have elements that are in two lists at once. I can change the data structure.’ It became something that could be mainstream instead of just enclosed in these packages.”

Testing for algorithms and data structures also tests for lifelong curiosity. Engineers should be “continually interested in keeping themselves up to speed, in revising the fundamentals and taking on intriguing programming problems. Those are the people I want to work with,” says Soham Mehta, CEO and cofounder of Interview Kickstart.

You Build Fragile Products

If you don’t test for CS fundamentals, it’s going to be really difficult for you to provide for your growing base of customers. When scaling out architecture, you have to understand how components work on a simpler, more fundamental level before applying them across multiple machines. If your engineers introduce enough logic-related bugs, you could lose valuable customer information or create bottlenecks, resulting in a slow customer experience.

This happened to Ben Sigelman, an ex-Googler who founded a company called LightStep, which builds monitoring and performance tools for developers of large distributed systems. He recently worked with a well-intentioned engineer who decided to use Redis for scalable, consistent and durable storage. But Redis is best as an in-memory data structure server and does not – and can not – scale well when placed into its “AOF” consistency mode. In that configuration, Sigelman says it’s much slower and less resilient than true distributed databases that append to cluster-level file systems. He makes a solid point:

“Formal CS training would have triggered a ‘too good to be true’ alarm, well before [the engineer] deployed it, and irrevocably lost user data in the process,” he says.

You End Up Reinventing the Wheel

If you don’t test for CS fundamentals, optimizing your codebase is going to take a lot longer than it should. Opponents argue that smart programmers use standard libraries to save time. Why reinvent the wheel when someone else has already solved this problem for you?

But, remember, we’re not asking advanced algorithmic interview questions because you’ll be writing algorithms from scratch on the job. We’re testing basic knowledge of fundamentals to ensure you’re not just relying on other people’s code, Stack Overflow or Google. Otherwise, when you need to scale and optimize, you’ll waste a lot of time trying to figure out optimal solutions. It’s not just about memorizing how to implement algorithms. Learning the trade-offs between algorithms is valuable in boosting efficiency. Simply testing candidates on knowledge of where trees fit in relative to sets or maps or linked lists is valuable in and of itself.
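To make the point about trade-offs concrete, here is a small Python sketch comparing three ways to hold the same numbers. It is an illustration rather than anything from the interviews described here: the sample data and function names are made up, and a sorted list searched with the standard `bisect` module stands in for a balanced tree, since Python has no built-in tree type.

```python
import bisect

values = [42, 7, 19, 88, 3, 61]  # made-up sample data


def contains_scan(items, target):
    """Linked-list-style lookup: check each element in turn, O(n)."""
    for item in items:
        if item == target:
            return True
    return False


# Hash set: average O(1) membership checks, but no notion of order.
value_set = set(values)

# Sorted sequence (stand-in for a balanced tree): O(log n) search,
# and it can answer ordered questions a hash set cannot.
sorted_values = sorted(values)


def values_in_range(sorted_items, low, high):
    """Return every item with low <= item <= high using binary search."""
    start = bisect.bisect_left(sorted_items, low)
    end = bisect.bisect_right(sorted_items, high)
    return sorted_items[start:end]
```

A call such as `values_in_range(sorted_values, 10, 70)` is the kind of query where an ordered structure earns its keep. If all you ever ask is "is this value present?", the set is usually the simpler choice.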

Gayle Laakmann McDowell, founder and CEO of CareerCup and author of Cracking the Coding Interview, offers a great example of what happens when a senior engineer doesn’t revise fundamentals:

“A more senior engineer building a parsing engine might not understand how she can leverage graph theory or trees. She could spend hours reinventing the wheel, only to come up with something less optimal in the end.”

It’s the same for debugging. The most efficient way to debug requires fundamental knowledge of how components behave with one another. Someone who doesn’t really know how things work might put in logging everywhere in hopes of catching errors by trial and error. A better way would be to systematically isolate issues by spotting patterns in the errors. You can only do this if you know the system and its algorithm.

It’s especially important if you’re not quite sure which specific tools you’ll need. If you’re building a long-lasting product, it’s crucial to test for timeless fundamentals that will be the foundation of future programs. “The breadth-first search algorithm, for instance, was invented in 1959 as the solution to the shortest path in a maze, but it’s still indirectly important to programmers today through some layer of abstraction,” says Dr. Heraldo Memelli, who oversees all of the code challenges at HackerRank.
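For readers who haven’t written it in a while, here is a minimal Python sketch of breadth-first search on a maze. The grid representation (a list of strings where `#` is a wall) and the function name are assumptions made for this example, not something taken from a HackerRank challenge.

```python
from collections import deque


def shortest_path_length(maze, start, goal):
    """Return the fewest moves from start to goal, or -1 if unreachable.

    maze is a list of equal-length strings; '#' marks a wall and any other
    character is open floor. start and goal are (row, col) tuples.
    """
    rows, cols = len(maze), len(maze[0])
    queue = deque([(start, 0)])  # (cell, distance from start)
    visited = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        # Explore the four neighbouring cells.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and maze[nr][nc] != "#" and (nr, nc) not in visited):
                visited.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return -1
```

Because the queue hands back cells in order of distance from the start, the first time the goal comes off the queue its distance is already the shortest one.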

Programming tools come and go, but fundamentals are forever. The assumption that you don’t need to know CS fundamentals on the job couldn’t be further from the truth.

But the Interview Can’t be ‘One-Size-Fits-All’

Of course, you can’t rely on general CS questions—alone—to hire for every role. The coding challenges you select have to be appropriate for the role you need filled, and basic fundamentals are one bar that should be cleared.

Leo Polovets has had a lot of experience in designing great screening processes as the second non-founding engineer at LinkedIn and engineer at Google. He offers a solid example:

“For a backend candidate, you might give them a problem where they need lists, sets, and hashmaps, and you want to make sure they use the right structure at the right time. For a front-end candidate, a good question might be to ask them to do some basic DOM manipulation. These could be 10-20 line programs, but they’ll still reveal a lot about what the person can and cannot do,” Polovets says.
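As an illustration of the kind of short backend problem Polovets describes, here is one possible exercise in Python. The task itself (summarizing a log of user visits) and all names in it are invented for this sketch, not taken from an actual interview.

```python
def summarize_visits(visits):
    """Summarize a list of user IDs given in arrival order.

    Returns (users in first-seen order, visit count per user).
    """
    seen = set()           # set: fast "have we seen this user?" checks
    first_seen_order = []  # list: preserves the order users first appeared
    counts = {}            # dict (hash map): user -> number of visits
    for user in visits:
        if user not in seen:
            seen.add(user)
            first_seen_order.append(user)
        counts[user] = counts.get(user, 0) + 1
    return first_seen_order, counts
```

Each structure does one job here. A candidate who checks `user not in first_seen_order` instead of using the set still gets the right answer, but turns a linear pass into a quadratic one, which is the sort of thing a question like this is meant to reveal.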

More Companies Should Prepare Candidates

The reality is, algorithm and data structure interview questions should be really easy, as long as you have some warning, good prep material and context for what interviewers really want to see.

Algorithmic coding challenges aren’t designed to evaluate how well you think on your feet. In fact, if you’re pop quizzing your candidates on algorithms, you’re most likely turning away really great people who happen to test poorly. The best tech companies are preparing their candidates as much as possible to create a stronger, more successful talent pool.

Facebook invests in teaching an interview prep class for all of their candidates. They realize that senior engineers or folks who are self-taught will need to prepare. It covers exactly what kind of algorithmic coding challenges they plan to ask and explains why. Of the three phases of Facebook’s technical interview, one is called “Ninja,” which screens the ability to solve tough coding challenges, like sorting algorithms. Any engineer who applies to Facebook has to do really well on these interviews. It’s one of the key reasons why Facebook has a world-class engineering team.

The assumption that you don’t need to know CS fundamentals on the job couldn’t be further from the truth for most jobs. Well-designed basic algorithm and data structures challenges are a good way to gauge depth of technical skills for sustainable products.


This article originally appeared on Forbes.




To help hiring managers create better code challenges, we analyzed hundreds of code submissions and spoke with several interviewers to create this in-depth guide:

Step 0. Before You Do Anything

Step 1. Designing Impactful Challenges

1a. The Challenge Checklist

Step 2. Setting Expectations, Warming Your Candidates

Step 3. Calibrating After the Screening

 

Q&A with Gayle Laakmann McDowell on Acing Programming Interviews

Last week, the renowned Gayle Laakmann McDowell, author of Cracking the Coding Interview and CEO of CareerCup.com, live streamed a 1-hour Q&A session in honor of this weekend’s Women’s Cup, an all-female online hackathon. She offered a wide range of valuable insights on how to prepare for technical interviews. You can watch the full live stream on the HackerRank Live YouTube channel here.
Here are the top 12 highlights:

Q: Is it okay to use pseudocode in an interview?

Most interviewers are not okay with pseudocode–they really want to see real code. It’s a slippery slope. It’s one thing to say the candidate doesn’t have to write semicolons–but that’s not pseudocode. Candidates can start glossing over details that can be really important. Candidates can ignore what the datatype of the list is.

That said, when I say real code–I don’t necessarily care about every little detail. If you’re implementing a method on a binary search tree, you don’t need to write the whole node class. If you want to write pseudocode as an intermediate step before writing real code, that’s okay. Just be mindful of how long you’re taking to write that because it’s taking time away from your real interview.

Q: If you were to hire somebody, would you prefer someone who’s a fast-learner or someone with strong technical skills?

It depends on the role: How long am I hiring them for? How many resources do I have?

Personally, if I’m hiring at a larger company, my bias is almost always towards someone who’s very bright because that person will have good technical judgement even if they’re not as knowledgeable. If I’m hiring someone who’s going to be the first person to architect a system in a particular technology, I’m going to hire someone who is knowledgeable in that tech. Hopefully they’re also bright.

It’s why a lot of larger companies use algorithm interviews. They have the resources to train people, so they want to hire people who are bright.

Q: How can I be great at competitive programming? How can I improve?

There are two segments to competitive programming:

  1. Problem-solving
  2. Knowledge

Interviews usually focus on #1 and competitive programming focuses on #2. So, as far as knowledge goes, know your algorithms really well. Know more obscure things. Pick up an algorithms book. Know your different data structures.

The problem-solving side overlaps with interviewing. Practice a lot. Get used to the habit of not knowing how to solve a problem and getting past it. One of my favorite techniques for solving a problem that you don’t know how to do is to really rely on examples.

There’s a technique I call “do-it-yourself.”

Give yourself a large, complex example–something that your brain can’t immediately see the answer to–and solve it manually. You’ll find that your brain has great intuition for how to solve a problem when you just solve it rather than guessing. Reverse-engineer your thought process and think about what you did to find these pairs or matrix, etc.

Q: What’s the best way to deal with interview pressure?

The more practice you get, the less likely you are to be nervous. Practice with mock interviews. Also, grab a buddy and take turns being on the other side of the interview. Practice interviewing and you’ll find that a lot of what you thought the interviewer was doing to intimidate you or throw you off is just how interviewers naturally act.

During the actual interview, the way that you think you’re doing has no correlation to actual performance. People think they did well but do poorly. Or vice versa. I’ve seen this all the time. The reason is that when I’m evaluating you, I’m not evaluating you on whether or not you got the question right. I’m evaluating you compared to other people.

So, you might struggle on a question, but others are struggling too. You can’t say that you’re doing poorly because you don’t know if someone else is doing worse than you.  

Q: Do you think algorithmic questions are effective?

Great question, it’s one I get a lot. They’re pretty good at determining intelligence. I’ve seen over and over again that when you throw a smart person a problem, they tend to show really good judgement. That assumes they have some baseline of knowledge, which algorithm interviews assess.

I do think they have their flaws, which is true about every interview process. Realize they all have their problems.

One of the most important factors in how successful someone will be is to work hard and focus on their work and working with other people. Work ethic is really hard to interview for regardless. We do the best we can but no interview process will be perfect.

Q: What’s your favorite interview question?

Great question. I usually rotate after I get bored of one. Currently my favorite question is:

You have 2 strings, one smaller, one larger…write an algorithm to figure out how many permutations of the smaller string exist in the bigger string.

It has a lot of different steps to it. There’s a lot of ways to tackle it. Everyone can find a solution to it. If you can’t find any solution to it, I’m pretty sure I wouldn’t hire you.

Then, there are many other steps. There’s a brute force solution most people come up with off the bat. There’s major optimization you can make and then there are several other optimizations you can make down the road. I don’t like interview questions that rely on one key insight or have one hard part to the problem. You might have gotten lucky with that hard part of the problem, so it’s not consistent. I like things with a lot of other hurdles.

It also doesn’t rely on a whole lot of knowledge. No one I’ve interviewed lacked the knowledge to tackle the problem.
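The brute force and the first major optimization described above can be sketched in Python. This is only one reading of the question: it assumes "permutations of the smaller string in the bigger string" means counting the windows of the bigger string that are rearrangements of the smaller one. The function names are illustrative and do not come from the interview.

```python
from collections import Counter


def count_permutations_brute(small: str, big: str) -> int:
    """Brute force: sort every window of len(small) and compare to sorted(small)."""
    k = len(small)
    if k == 0 or k > len(big):
        return 0
    target = sorted(small)
    return sum(
        1 for i in range(len(big) - k + 1) if sorted(big[i:i + k]) == target
    )


def count_permutations_sliding(small: str, big: str) -> int:
    """Sliding window: keep character counts and update them as the window moves."""
    k = len(small)
    if k == 0 or k > len(big):
        return 0
    need = Counter(small)
    window = Counter(big[:k])
    matches = 1 if window == need else 0
    for i in range(k, len(big)):
        window[big[i]] += 1          # character entering the window
        left = big[i - k]            # character leaving the window
        window[left] -= 1
        if window[left] == 0:
            del window[left]         # drop zero counts so equality checks stay exact
        if window == need:
            matches += 1
    return matches
```

The sliding version avoids re-sorting each window; each step costs at most a comparison over the distinct characters involved. Further optimizations (for example, tracking how many character counts currently match) are possible but omitted here.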

Q: What was the biggest turning point in your programming life?

There were two moments for me:

1. I got an internship at Microsoft as one of the youngest interns (I was 18 at the time). It was a huge moment for me. When you have a job at a prestigious company, it opens up a lot more opportunities down the road. That was really big.

2. The other side of that: Leaving Google. I spent about 3 years there. It took me a while to say “This isn’t what I want to do.” Part of it was that I had equity in the company, which in retrospect wasn’t much. But it seemed significant at the time. It took me a while to say no. This was in 2008. So, not many people had left Google and it was still the company to join. It was hard for me to turn my back on that.

I didn’t want to be a programmer all day every day; I wanted to do something more. I love programming, and it was a good company and all, but I wanted to pursue my entrepreneurial passion.

Q: How should I prepare for Big O questions?

The most common question you’ll see is this: You design a new algorithm and you have to state the runtime of that algorithm. First of all, practice is a really useful thing. But one thing I’d be careful of is recognizing that variables have meanings. When you say inserting into a balanced binary search tree is log n, it’s not inherently log n. It’s log of the number of nodes.

If you call the number of nodes “n” then yes, it’s log n. But you could also say “n” is the depth of the tree. So you need to be specific and careful about what your variables are and what they mean. I’ve seen this mistake a lot. Be very careful of your variables. You’re better off not using “n” as a variable name at all. Instead, use logical variable names.

Also, be cautious of when you multiply and when you add. When an algorithm has two different steps, where you do one thing until you’re all done and then you do another thing, that’s addition. When you walk through something and say every time you do this, you do that, that’s multiplying. That’s another mistake people make a lot.
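The add-versus-multiply rule can be made concrete with a small Python sketch. The functions below are hypothetical illustrations that simply count loop iterations; the parameters are named for what they hold rather than “n”, in the spirit of the advice above.

```python
def sequential_steps(users, orders):
    """Do one thing until it's done, then another: O(len(users) + len(orders))."""
    operations = 0
    for _ in users:
        operations += 1
    for _ in orders:
        operations += 1
    return operations


def nested_steps(users, orders):
    """For every user, walk all orders: O(len(users) * len(orders))."""
    operations = 0
    for _ in users:
        for _ in orders:
            operations += 1
    return operations
```

With 3 users and 4 orders, the first function does 7 units of work and the second does 12; the gap widens quickly as either input grows.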

Q: What’s the difference in interviewing process between Google and Facebook?

The interview processes are far, far more similar than they are different. Yes, there are some differences, but they’re largely the same.

The first difference is that Facebook is usually more structured. People have a clearly defined role. E.g. for software engineer, you have an algorithm interview, design interview and culture fit/coding interview. It’s well-defined. The interviewee knows what they’re walking into.

The other difference is that Google tends to be more generalist in their hiring philosophy; Facebook has a little bit more recognition of specialties. At Google, there’s more of a tendency to hold a front end developer to the same expectations as a back end developer. Whereas Facebook is more likely to recognize: “Hey, you’re a front end developer, we really want that skill-set. We might lower the expectation for some of the more generalist algorithms.” Ultimately, you don’t have to know much less, but when comparing you to other people, they don’t expect quite as much.

Again, they’re more similar than different.

Q: Is there any interview advice you’d give women and not men?

The advice I’d give to a man is typically the same that I’d give to a woman. There are some small differences. I’ve seen more women than men be particularly nervous in interviews. And be less willing to advocate for themselves. That’s one thing I often push on more with women than men. It’s okay to be clear about your accomplishments. I’ve seen that a lot of women are nervous about attributing their accomplishments to themselves and they’re quick to give credit to their team. Don’t take credit for your team’s stuff, but don’t give away credit for what you did.

Be aware that you shouldn’t be talking about “we the team” constantly. Attribute to yourself. That’s advice that goes to both women and men, but I’ve seen this problem more in women. I’ve had women be worried about coming off as arrogant. I haven’t seen as many women that have come off arrogant as men.

Q: What skills do you need to land a job at top tech companies, like Google?

There are two parts to this:

1. Prove to the recruiter that you have the right skills.

2. Prove to the recruiter that you can pass their interview.

You need the right experience to do well both in the job and in the interview, because ultimately, if your resume looks great but I don’t think you can pass the interview, then I won’t interview you. The best thing you can do is develop a lot of projects. Have a project on your resume. Do a lot of coding. Learn a lot of programming languages. Don’t get tied to one programming language. It can actually be a little bit of a red flag for some people.

If the candidate’s all about Java, people tend to have a bad vibe from that. Identify yourself as a great software developer, and not someone who’s just great at one programming language.

The other side of the equation is demonstrating that you can do well in an interview. Demonstrate that by showing that you know algorithms and data structures. One way to do that is with a computer science degree. If you don’t have a CS degree, show that you’re taking classes on Coursera or doing things like taking coding challenges on HackerRank. Those are actually really effective.

When I’ve reviewed people who have done coding challenges like that, I put them in the “yes” pile because I know that they are interested in algorithm challenges, which is often a good sign that they’re knowledgeable about algorithms. It’s worth giving them a chance to interview.

Q: What’s the best Github project to get noticed?

I get this question a lot. The best thing to do is just do projects. People worry about little details but what really matters is that you’re doing something. If it’s open source, great. If it’s your own iPhone app, great. If it’s a web app, great. An individual team or company might want one thing more than another. For the most part, it doesn’t matter that much. The most important thing is that you’re coding.

That being said, open source has its advantages: You can see what it’s like to work in a large code base. But it’s hard to attribute what exactly you did. Whereas if it’s your own iPhone app, it’s clear what you did. Above all, pick something you’re excited about so you’ll enjoy it more, because what matters is that you’re doing something.

Why Renowned Googler Ahmed Aly Chose HackerRank

He’s been recognized as being among the top 2% of coders in the world. He ranked #1 – #3 in prestigious competitions in the Arab region 4 years in a row before moving up to the judge level.

Ahmed Aly consistently receives LinkedIn InMail from all of the best tech companies in Silicon Valley attempting to poach him away from the almighty Google.

“HackerRank’s Vivek [Ravisankar] was the only message I replied to because I really like what HackerRank’s doing. And I think this is the field where I can do my best. I can do what I love doing,” he says.

After several years of running Google Code Jam, an online coding competition with over 56,000 registrants, he wants to help HackerRank take the coding community to another level.

Answering a Calling for Competitive Programming

When Aly serendipitously discovered the rip-roaring world of competitive coding at 20 years old, all bets were off. With each new challenge under his belt, a dose of passion for the competitive sport pumped into his veins. A lifelong tunnel formed around Aly’s mind with a singular goal: Dominate coding competitions.

He started skipping classes for competitions. Grades started falling. He was supposed to graduate from Cairo University a year early, but he lost that head start. Even after graduation, he quit his software engineering job to focus on his last ACM regional contest training.

Nothing could stop him from pursuing this newfound passion for enduring new challenges. And it never would have happened if it weren’t for a college friend who happened to back out of an ACM challenge last minute.

“You need 3 people to enter, so my two friends just needed a body–anybody–to be able to enter, so I just joined,” he says.

It’s funny what happens when you’re open to new experiences. This small favor to a friend turned into a lifelong passion.

Those of you involved in competitive coding know that starting at the age of 20 is actually considered ancient. We occasionally receive fan mail from 10-year-old aspiring programmers who love solving challenges. Since most coders start in grade school, Aly was pretty late to the game.

But that only made Aly more determined. Back in 2007, there weren’t too many resources for folks trying to learn competitive coding strategies–fast. So, Aly built his own.

Aly is most famously known for creating A2 Online Judge, a website that aggregates ample coding challenges from 14 different coding contest websites. It offers the opportunity for passionate coders to create problems, use valuable resources for practicing, a community of people to chat with and much more.

He single-handedly built this community as a hobby while holding down school and then a full-time job upon graduating. He blasted through a million lines of code for his baby, A2OJ, with its servers still maintained in his garage. The community’s grown to over 30,000 registered users and 600,000 page views per month.

He might have been relatively behind compared to some of the other coders who started while he was playing video games, but he raced ahead and dominated the industry by becoming a judge for numerous prestigious competitions, including ACM and Google Code Jam.

Goodbye Google, Hello HackerRank!

“I joined HackerRank because I used to do

For the last 3.5 years, Aly has been working on Google’s coveted search algorithm for 80% of his time. In the other 20%, Aly ran the entire Google Code Jam operation.

But Aly’s undying passion for competitive programming swelled enough to make him grow restless. He wanted to turn that 20% into 100% of his time.

“When I work on A2OJ or Code Jam, it didn’t feel like work. I never get tired,” he says. “I did this because I want to do what I love. If you do what you love, then you’ll always be good at it.”

The HackerRank team is thrilled to welcome Aly to the team. We’re excited to infuse his expertise into our larger vision of transforming the paradigm of tech recruiting. The melding of his brilliant mind and passion will help HackerRank innovate by turning coding challenges into the default, 21st century resume.

We also sat down with Ahmed to reflect about his achievements thus far. Hear from the master himself:

Ahmed, what advice do you have for people who want to become great programmers like you?

Don’t try to solve harder problems unless you are really good at solving the easier ones. That means solving a lot of really easy problems (that could be hundreds). That will improve your coding skills, which should be the easiest skill to gain. Then go to slightly harder problems, and so on.

Who was pivotal to your success?

The Great Fegla, that’s what we call him. His real name is Mohamed Abdelwahab, and he was my coach for the ACM ICPC competitions. He isn’t just my coach; he is my friend and like my brother. I learned a lot from him, and I’m still learning. He changed me from a loser student who fails in many courses to a good student and programmer. More details here.

“Practice by solving a lot of problems; then compete against the problem set, not against the other teams,” he often told me.

What’s something most people don’t know about you?

People usually don’t believe me when I say this, but while studying in college, I failed in the following courses (got less than 50% grade): Computer Programming 1, Computer Programming 2 and Data Structures.

 

Want a chance to chat with Ahmed? Sign up for World Cup, the ultimate university CodeSprint. The winner gets to video chat with Ahmed Aly!

The Unhealthy Obsession with Tree Questions

Why do engineers love to ask fundamental linked list and tree questions in interviews when you rarely code these problems in real-world development?

It’s evolved into a rite of passage. Every engineering candidate, from fresh-faced grads to authors of crucial open source contributions, solves fundamental data structure problems on the spot for interview screenings.

It’s how it’s always been done. But it makes sense. This ritual has sustained itself over the past few decades because it’s a fast, reliable way to spot smart candidates who can think deeply. Plus, it’s better to hire for the ability to solve timeless fundamental problems than to hire for knowledge of transient tools. Hence, each of the top 10 technology companies in the Fortune 500 has asked engineering candidates about core computer science concepts, including tree or list-related programming questions, within the last few years:

[Image: Tree obsession]

When front-end developer Stephanie Friend graduated from Cal Poly with an engineering and liberal arts hybrid degree, it had been a while since she sat in a lecture hall to learn about linked lists. It’s a good thing she blew the dust off her old data structure books and practiced challenges online before interviewing at one Silicon Valley startup in May this year:

“I had an interview with 6 different engineers on the same team, and 5 out of 6 interviewers asked me to solve a different linked list problem for a web development position,” Friend says.

So, why the need to ask 5 different linked list questions on 5 different occasions for 1 company? Some argue that you can’t be a great programmer unless you have these fundamentals down pat. Others say that CS fundamental knowledge is a good predictor of other useful programming knowledge.

It’s why most programming interview prep books, even as early as the 2000s, have chapters dedicated solely to basic data structure and algorithm problems (e.g. 1, 2 and 3). Plus, data structure and algorithm questions make up the bulk of upvoted questions on CareerCup, a job prep community. You might be a little puzzled why we’re criticizing these questions, considering tree and linked list challenges are some of the most popular on our own HackerRank platform.

But there’s a big flaw with companies that aren’t preparing candidates sufficiently before an interview and then relying solely on academic CS fundamentals to weed out unqualified candidates. Data structure and algorithm fundamentals are just one part of what makes a great engineer. Depending on the need, managers should also look at other crucial components, like technical experience, hard-to-acquire knowledge, design and debugging skills to comprehensively assess a candidate.

While fundamentals are crucial, using data structure questions as the be-all end-all filter for great programmers can be detrimental for talented engineers who don’t have CS degrees or who earned their CS degrees several years ago. By placing a heavy emphasis on fundamental knowledge–without properly preparing candidates–companies can create a bias toward recent CS graduates. As a solution, interviewers need to empower candidates with preparation material to reduce the number of great programmers who are rejected.

The Real World vs. Programming Interviews

A typical programmer, even at a top tech company, would rarely implement a data structure like a binary tree from scratch. So, many devs might be out of practice with this by their next interview. The most famous recent example is Max Howell, the author of Homebrew, a celebrated package management system for Macs. This year, Howell applied for an engineering position at Google and was rejected because, as he claims, he couldn’t “invert a binary tree” during the initial interview.

While we can’t definitively say why Howell didn’t get the job (it could have been a number of factors that interviewers don’t reveal), it’s likely that he could have performed better if he had known to brush up on those fundamentals by practicing online. After all, he’s an extremely accomplished and smart engineer. So, there’s a chance he was a classic false negative candidate.
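For readers who have not seen the problem, here is a minimal Python sketch of what “inverting a binary tree” is commonly taken to mean: mirroring the tree by swapping the left and right children at every node. The exact question Howell was asked is not known, so this reading, along with the `Node` class and `invert` function names, is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def invert(root: Optional[Node]) -> Optional[Node]:
    """Mirror the tree in place by swapping children at every node."""
    if root is None:
        return None
    # Both recursive calls are evaluated before the assignment, so the swap is safe.
    root.left, root.right = invert(root.right), invert(root.left)
    return root
```

The whole solution is a few lines once the recursive structure is familiar, which is part of why a little practice beforehand goes a long way.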

The obsession that top tech companies have with data structure problems can also be unfair to engineers who never sat in a CS class a day in their lives. If you don’t have a CS degree, it can be difficult to gauge how much you need to know to clear the initial bar during interviews.

“While these questions can help select talented developers, they become highly problematic when somebody doesn’t have proper tools to prepare and doesn’t understand what is expected. A candidate might feel they need to read an entire algorithms book, which wastes their time and results in less time actually practicing problems,” says Gayle Laakmann McDowell, tech hiring consultant and author of Cracking the Coding Interview.

Although many great candidates get rejected because they failed to adequately prepare, it’s actually not that hard for smart developers to learn or re-learn the fundamentals. It’s part of why companies are fine with requiring them. Today there are a host of online resources to help you practice data structure questions. McDowell, who’s passionate about teaching programming, once successfully taught a student the required basics in just 2 hours. Another one of McDowell’s students was a self-taught programmer with a degree in music who learned and practiced enough of the fundamentals to land a job at Facebook.

But not everyone’s fortunate enough to have McDowell coach them individually. Self-taught students and experienced programmers are left to fend for themselves, leaving many of them annoyed and confused about the purpose of such fundamentals at dream company interviews, like Google and Apple (complaints evident here, here and here).

The best companies also test for other important factors that make a great engineer. For instance, discussing a technical project that a candidate is proud of can reveal knowledge, passion and ability to communicate well. Again, fundamentals can be very easy to learn if you know how much to prepare.

The Root of the Obsession

Since the initial boom in software engineering back in the 1980s, data structure and algorithm questions have been the common way to test candidates. The earliest engineers with growing teams carried CS degrees, and they knew that algorithm classes were a place that required deep thinking. So, engineers of the 80s created an interview process that resembles algorithm classes. And it works accurately enough. When searching for talent, these questions are fast enough to answer in less than an hour and help interviewers gauge a programmer’s intelligence. It’s certainly not the only way to filter candidates for smartness, but, again, it works well enough.

McDowell theorizes that there might be another reason why companies expect knowledge of data structures like linked lists and trees: It’s hard to find enough algorithm questions that don’t involve these.

“Companies test algorithmic problem solving skills because they believe that people who are smart will generally do good work; they’ll find good solutions, write good code, and so on. I suspect companies continue to expect knowledge of data structures like linked lists and trees (which developers rarely directly use) because it’s hard to find enough algorithm problems that don’t cover this knowledge. And, since enough people have CS degrees, and it’s easy enough for those who don’t to learn this material, it creates a pattern where it’s okay to expect that knowledge,” McDowell says.

Companies deem this system effective because successfully answering algorithm questions is a positive indicator of success on the job. As McDowell says, it means they’re smart and they’re likely to do better work.

However, companies that don’t prepare candidates well enough aren’t giving them a chance to perform well at these fundamental questions. Most companies recognize that some good candidates will be rejected through these questions, but they’re okay with the drawback of missing out on good candidates. They figure it’s better to reject a good candidate than hire a bad one. The veteran engineer Joel Spolsky, a co-founder of Trello, once laid out this common hiring philosophy in detail back in 2004:

“It is much, much better to reject a good candidate than to accept a bad candidate. A bad candidate will cost a lot of money and effort and waste other people’s time fixing all their bugs. Firing someone you hired by mistake can take months and be nightmarishly difficult, especially if they decide to be litigious about it. In some situations it may be completely impossible to fire anyone. Bad employees demoralize the good employees,” Spolsky says.

There might be some validity in the cost per bad hire in Spolsky’s outlook, but rejecting too many good candidates will dramatically increase the cost and time to hire — and ultimately, restrict company growth. All companies should be concerned with this.

Interview Prep Is Not ‘Cheating’ and Is Necessary to Fill Empty Positions

About 10 years ago or so, companies were somewhat more wary about giving candidates preparation material before an interview. It was sometimes considered taboo or even “cheating” because they worried candidates might memorize problems and regurgitate knowledge in the interview.

But that mindset has slowly started to shift as the shortage of talented developers has intensified. In a 2013 survey of over 1,500 senior IT and business executives, more than a third identified availability of talent, employee turnover and labor prices as a business concern.

“Jobs postings will be listed for months without finding a good candidate,” former Zynga software engineer and founder of Appurify, Rahul Jain told TechCrunch.

Given these concerns, it’s actually in a company’s best interest to help candidates with interview prep by giving candidates a chance to practice solving CS fundamental problems. It’s simple: Better prepared candidates lead to fewer false negatives. Plus, most engineers can easily distinguish between someone who’s just memorized answers and someone who can truly solve a hard problem.

The best tech companies realize that it’s actually beneficial to both engineers and companies to give candidates a fair opportunity to put their best foot forward. This is especially true given the obsession with fundamentals that most engineers don’t usually revisit since the good old college days. It’s also a good way to skip the anxiety-ridden phase of the interview and get to other meatier questions that are just as important in assessing candidates, like culture fit and collaborative skills.

Googler Steve Yegge is one engineer who realized candidate preparation is an effective solution to the talent shortage early on. Back in 2008, he “secretly” blogged engineering interview tips for Google candidates in hopes that more of his interviewees would succeed:

Time passes, and interview candidates come and go, and we always wind up saying: ‘Gosh, we sure wish that obviously smart person had prepared a little better for his or her interviews. Is there any way we can help future candidates out with some tips?’

Google doesn’t know I’m publishing these tips. It’s just between you and me, OK? Don’t tell them I prepped you. Just go kick ass on your interviews and we’ll be square….

As late as 2008, people were so against offering candidate prep that Yegge even considered publishing his tips under a pseudonym to avoid upsetting people. Ultimately, his desire and need for better prepared candidates outweighed the risk of earning negative sentiments. This is also a huge reason why McDowell left Google to start her empire of interview prep about seven years ago. As a software engineer at Apple, Google and Microsoft, McDowell interviewed one too many ill-prepared but smart candidates. She wanted to teach and help more engineers become better interviewees; thus, CareerCup was born.

The best tech companies are starting to realize that the more preparation, the better the interview-to-hire rate. For instance, today Facebook hires McDowell for a weekly 1.5 hour class for candidates exclusively for interview prep. She walks Facebook candidates through the problems and even offers tips along the way. Facebook found so much success in this recruiting strategy that they doubled the frequency of her class. Today select top-tier tech companies, like Pinterest, Google, Airbnb and Twitter, send at least an email that points candidates to resources for better preparation and practice for fundamental CS challenges.

More companies should look at why they’re rejecting great candidates and how they can reduce false negatives to help grow their team more successfully. Empowering smart candidates by setting more realistic expectations for candidates about the interview process is one way to accomplish this.

This decades-old process of testing engineers’ intelligence through fundamental CS questions may be sufficient to identify great programmers. But this process should come with a mechanism (at the minimum an email with links to resources) to help candidates practice these fundamental challenges for the interview. By helping candidates prepare, companies can more easily identify great developers and reduce the bias against older and nontraditional candidates. They can also focus on other important components that are crucial in evaluating strong engineers. This ultimately reduces hiring costs and fuels company growth — a win for the company and the candidate.

 

Do you prep your candidates before quizzing them on trees and linked list questions? 

The Risky Eclipse of Statisticians

If statisticians have historically been leaders of data, why was there a need for a brand new breed of data scientists?  While the world is exploding with bounties of valuable data, statisticians are strangely working quietly in the shadows. Statistics is the science of learning from data, so why aren’t statisticians reigning as kings of today’s Big Data revolution?

In 2009, when Google was still fine-tuning its PageRank algorithm, which is based on the statistical innovation of Markov chains, Google’s Chief Economist Hal Varian declared statistician the sexiest job of the decade. We’re about halfway through, and it seems that Varian missed the target.

“Professional statisticians are milling at the back of the church, mesmerized by the gaudy spectacle of [Big Data] before them.” – David Walker, statistician, Aug 2013.  

Google Trends shows us that while the popularity of Big Data is thriving, statisticians’ popularity has been declining over the years. Back in 2010, predictive modeling and analytics website Kaggle proudly dangled Varian’s prediction as a carrot on their careers page to lure people to join their team. But today the quote has curiously vanished, no longer deemed worthy.

What speaks even louder is that statisticians are often left out of some of the biggest national discussions happening around Big Data today. For instance, UC Berkeley’s Terry Speed observes:

  • The US National Science Foundation invited 100 experts to talk about Big Data in 2012. Total number of statisticians present? 0.
  • The US Department of Health and Human Services has a 17-person Big Data committee. Total number of statisticians? You guessed it…0.

Justin Strauss, co-founder at Storyhackers, who previously led data science programs in the healthcare industry, can attest to this more generally. He says he has “seen an underrepresentation” of statisticians at conferences and other events related to Big Data. But statistics is the foundation of understanding Big Data. This was supposed to be their decade–their time to shine in the limelight. So, what changed? As renowned statistician Gerry Hahn once said:

“This is a Golden Age of statistics, but not necessarily for statisticians.”

Instead of crowning statisticians king, the Big Data revolution borrowed the foundational elements of applied statistics, married them with computer science and birthed an entirely new heir: The Data Scientist. But this underrepresentation of statisticians puts the future of Big Data at risk. The accurate evaluation of data that comes from a strong foundation of statistics could be lost in the hype.

Why Didn’t Statisticians Own Big Data?

Plenty has been written about the recent rise of data scientists, but the application of data science to industry is far from new. In the early 1900s, statistician William Gosset studied yeast for the Guinness Brewing Company and invented the t-distribution in the process. Statistician Kaiser Fung points out that one of the most notable examples of a business built upon statistical algorithms came decades before Google. Fair Isaac Company introduced the analytics of credit scoring in the 1950s. Not to mention the US government has been performing census calculations with incredible precision for hundreds of years as well.

There are three plausible reasons why statisticians aren’t leading Big Data today. First, computational statistics of Big Data never flourished in mainstream statistical sciences.

“The area of massive datasets, though currently of great interest to Computational statisticians and to many data analysts, has not yet become part of mainstream statistical science.” – Buja A. Keller-McNulty

This quote was published in 1999. And, more than a decade later, it still hasn’t happened. Although early statisticians recognized and discussed Big Data, many of them were ignored. Speed points out that statisticians have published books and papers about the techniques of wrangling large datasets. But they collected dust, as evidenced by the number of citations they earned. For instance:

[Image: citation counts for statisticians’ publications on large datasets]

Second, statistics is a crucial part of data science, but alone it is insufficient for making sense of the exponentially growing amounts of messy data we are producing daily. That requires computational power that can only be supplied by the advanced technology we have today. In 2010, the world stored about 6 exabytes of data, a stat so incomprehensible that it’s borderline meaningless. For a frame of reference, if you converted all words ever spoken by humans into text, that’s about 5 exabytes! Here are some more quick Big Data stats:

[Infographic: Big Data stats]

Machine learning is deeply rooted in statistics, but few statisticians have the technical skills to manipulate a dataset of 10 billion data points in which each data point has a dimension of 10,000. But it’s not that statisticians lack computational knowledge. It’s that the field of statistics simply wasn’t equipped with the computing power we have today. For instance, data scientist David Hardtke led the invention of the Bright Score, an algorithm that assesses your fit for a job, which was acquired by LinkedIn. But he says none of these ideas are really new. Back when he first started in the space, he met a senior researcher at Recruit Holdings, a Japanese recruiting firm.

“He told me he’s really interested in what I’m doing because he tried to do the same thing in the 80s. He said, frankly, it was way too expensive back then. You had to buy these massive computers and it wasn’t cost effective,” Hardtke says. 

But now, we’re at this convergence of super cheap, high-speed computing that’s helping data scientists process powerful insights and find answers to questions that remained a mystery 20 years ago. With Big Data booming, pure statistics is fading into the background relative to the demand for data science.

Third, some statisticians simply have no interest in carrying out scientific methods for business-oriented data science. If you look at online discussions, pure statisticians often scoff at the hype surrounding the rise of data scientists in the industry. Some say it’s a buzzword with good marketing (here), others say it’s a made-up title (here) and some call them folks who sold out to shareholders (here).

Statisticians’ Absence Could Lead to Misuse of Data

Even without a prominent presence of statisticians, educational institutions have churned out entirely new curricula devoted to the so-called “new” field of data science in just the last few years. But when dealing with Big Data, someone on the team needs to have a strong grasp of statistics to avoid reaching inaccurate conclusions.

The elevated hype about data scientists is undeniable. The WSJ reports that these jobs are so in demand that data scientists with two years of experience are earning between $200,000 and $300,000 annually. The role was dubbed the sexiest job of the 21st century in 2012. Universities are having to turn down data science students because of the surge in popularity. As a result, there are at least a dozen new data science bootcamps that aim to prepare graduates for data science jobs. And universities across the nation are creating brand new courses and programs for data science or business analytics. Here’s a visualization thanks to Columbia Data Science:

[Figure: visualization of new data science programs, via Columbia Data Science]

But, as with any new curriculum, space is limited. This is where it gets risky. Ben Reddy, a PhD in Statistics at Columbia University, finds that the foundation of statistics often takes a backseat to learning the technical tools of the trade in data science classes. And even if students are carrying out statistical models in classes, doing statistics doesn’t guarantee that you understand statistics. Since learning R or NumPy is usually the gateway to getting your hands on real-world data, understanding statistical analysis often seems less interesting by comparison.

“Anyone who can type <t.test(data)> into R (not to mention <lm()>, <knn()>, <gbm()>, etc.) can “do” statistics, even if they misuse those methods in ways that William Sealy Gosset wouldn’t approve on his booziest days at the Guinness brewery,” Reddy writes.

The worst part is, you can usually get away with subpar analysis because it’s hard to judge the quality of statistics without examining the analysis in detail, he adds. And, usually, there’s not enough transparency to do this in the real world. So, with the absence of statisticians in Big Data today, how well are the fundamentals of statistics carried over into this new data science boom? Most students haven’t even graduated from these brand new data science courses yet, so it remains to be seen.
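To make Reddy’s point about one-line statistics concrete, here is a minimal Python sketch of what a one-sample t-test computes under the hood. It is an illustration only, not R’s implementation: the function name `one_sample_t` and the sample data are hypothetical, and the result is only meaningful if the observations are independent and roughly normally distributed, which is exactly the kind of assumption a one-liner never forces you to check.

```python
import math
import statistics

def one_sample_t(data, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0.

    Hypothetical helper for illustration. Assumes independent,
    roughly normal observations.
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all observations are identical")
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1

# Made-up sample, purely for demonstration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
t_stat, dof = one_sample_t(sample, mu0=3)
print(f"t = {t_stat:.4f} with {dof} degrees of freedom")
```

The arithmetic is trivial; the hard part is knowing whether the test applies to your data at all.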

But this risk of losing the fundamentals is largely why Hardtke, a physicist himself, is opposed to these new degree programs. He makes a compelling point: It’s better to have someone who’s really passionate about geology, physics or any other science because they’ll pick up the tools of data manipulation as part of a bigger mission.

“I’d rather have someone major to get some answer and learn the tools along the way rather than learn the tools as the terminal point,” Hardtke says.

But the Most Powerful Data Science Teams are Multidimensional

Folks outside of the space often don’t realize that the most astonishing achievements in data science weren’t accomplished by just one superstar, unicorn data scientist. When Hardtke was tasked with building a strong data science team at startup Bright.com several years ago, he couldn’t afford to recruit the best data scientists away from the likes of Google and Facebook. But he knew something most data scientist-crazed recruiters don’t understand: At its core, it’s all about learning how to ingest data using statistical methodology and computational techniques to find an answer.

Most scientific disciplines require this knowledge. So, he hired scientists across disciplines: a physicist, a mechanical engineer, a statistician, an astrophysicist, basically anyone who wasn’t a computer scientist or data scientist. The most successful, passionate data science teams in Silicon Valley comprise a combination of different scientific disciplines that look at one problem from unique angles. It’s the only way to work through seemingly impossible problems in data science.

If you ask Nitin Sharma, for instance, about his data science team in the early days of Google, his eyes instantly light up. With experts from psychology, computer science, statistics and several other disciplines, Sharma’s diverse team offered perspectives from every dimension possible. Google’s head of search Amit Singhal once asked him: “How do you know if people are happy with the search results?” Simply tracking clicks on links can’t determine whether or not a searcher was happy with the results. And so, the challenge was on for Sharma’s team.

“I can’t tell you the details of what Google did, but conceptually, we looked at what sequence do these clicks have? How much time they’re spending? How often do they refine queries? How often do they click on results? How many results? How does it depend on the type of query?” Sharma says. 

And, ultimately, Sharma’s team was able to work together to find a successful plan to monitor a user’s happiness, which offered deeper insight into search behavior and satisfaction with search results. While both data science and statistics share a common goal of extracting meaningful insight from data, the evolution of data science in the last 10 years emphasizes a demand for a combination of interdisciplinary skills.

Data science is making statistics, on its own, irrelevant in industry, and in the process it is eclipsing statisticians, the fathers of data science.

On a scale of 1-10, Sharma says we’ve only inched maybe 1-2 in terms of progress in data science. With the forthcoming revolution of the Internet of Things, there are infinite possibilities before us. The biggest challenge will be: How do we process and understand this insurmountable volume of data? The onus can’t be on “rockstar, unicorn” data scientists alone. And it can’t fall on statisticians either. Although the demand for pure statistics will shrink relative to data science over time, it’s going to be more important than ever to have interdisciplinary knowledge from a variety of fields. And to ensure quality and foundational understanding of applied statistics, it’s crucial to save a seat for statisticians at the Big Data table.

Have you noticed an underrepresentation of statisticians in Big Data? Tell us what you think in the comments below!


To get occasional notifications when we write blog posts, please sign up for our email list.

The Inevitable Return of COBOL

It’s only a matter of time until the Common Business-Oriented Language (COBOL) regains its spotlight as one of the most in-demand skills for future generations of software engineers. We can just see it now: Programmers of the future will hop out of their driverless cars, walk into their offices and sit down to start coding in 1959’s COBOL.

It sounds crazy, considering COBOL is the furthest thing from most engineers’ minds today. It ranks fairly low in the TIOBE Index, a measurement of today’s most popular programming languages. Many newer, speedier languages give today’s coders little reason not to scoff at the antiquated COBOL. The most telling evidence of COBOL’s irrelevance is that about 70% of universities said they don’t even include COBOL in their computer science curriculum anymore, according to a recent survey. It’s logical. Why waste curriculum space on a skill that employers don’t even look for these days? A quick search for “COBOL programmer” on any job site, for instance, yields a few hundred job postings while the more popular “Java programmer” yields thousands.

Based on these facts alone, COBOL appears to be nearly extinct. You might even wonder why we’re writing about COBOL at all. But looks can be deceiving. COBOL is a mysterious paradox. Born in another era, COBOL lives on as the quiet but important pillar on which the majority of businesses stand today. In a field that evolves at an unprecedented speed, younger generations may be overlooking a critical skill of the future.

Out of Sight, Out of Mind

Those of you who are familiar with legacy systems know the widely cited stat: *70-80% of all business transactions worldwide run on COBOL today. But what’s not emphasized often enough is that there’s a slowly widening gap between the number of massive institutions relying on COBOL and its relevance among programmers today.

[Infographic: COBOL knowledge skill gap]

While newer programmers aren’t as interested in learning COBOL, its role in millions upon millions of mission-critical transactions from healthcare to travel can’t be denied. COBOL is written for mainframes created 10 years before man walked on the moon. Those same mainframes still power some of the biggest institutional computing systems today. If someone pulled the plug on COBOL, millions of businesses worldwide would suffer from malfunctioning machines. This old programming language is a bit of a taboo today, but it’s important to recognize just how big an impact COBOL still has on our day-to-day lives. Here’s a quick breakdown of the biggest systems that run on COBOL:

[Infographic: COBOL impact]

Why Haven’t We Replaced COBOL?

If COBOL’s been such a foundational root of business apps for decades, why hasn’t something better come along? The short answer: If it ain’t broke, don’t fix it. The long answer requires us to take a step back in time. COBOL was created in 1959, in a computing era when programming languages were tailored for specific purposes. For instance:

  • Fortran → Scientific problems
  • Lisp → Artificial intelligence
  • COBOL → Business applications

As banks, insurance companies and government institutions started joining the computer age, they’d create programming languages specific to their machines. You can imagine how costly and time-consuming this was. There needed to be a universal business language to carry out business operations faster.

[Photo: Grace Hopper]

Grace Hopper, the mother of COBOL, helped champion the creation of this brand new programming language that aimed to function across all business systems, saving an immense amount of time and money. Hopper was also the first to believe that programming languages should read just like English instead of computer jargon. That’s why COBOL’s syntax is so wordy. But it helped humanize the computing process for businesses during an era when computing was intensive and prevalent only in research facilities.

And so a new committee, made up of people from industry, universities and the US government, formed to develop the much-needed language to help standardize business programming. In the 1960s, the Department of Defense even decreed that all businesses must run on COBOL. One article by notable software engineer Robert L. Glass explains that COBOL handles four essential business tasks better than most modern languages:

  • The capability for heterogeneous “record-structure” data
  • The capability for decimal arithmetic
  • The capability for convenient report generation
  • The capability for accessing and manipulating masses of data (typically made up of heterogeneous data structure).
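The decimal arithmetic item in the list above is easy to underestimate, so here is a minimal Python sketch of why it matters for money. It uses Python’s standard `decimal` module as a stand-in for COBOL’s fixed-point fields. The `add_sales_tax` helper, the prices and the half-up rounding policy are assumptions made for illustration, not anything taken from Glass’s article.

```python
from decimal import Decimal, ROUND_HALF_UP

# Binary floating point cannot represent 0.10 exactly, so tiny errors creep in.
float_total = sum([0.10] * 3)

# Decimal arithmetic keeps cents exact, which is what business math expects.
decimal_total = sum([Decimal("0.10")] * 3, Decimal("0"))

def add_sales_tax(amount: Decimal, rate: Decimal) -> Decimal:
    """Apply a tax rate and round to whole cents.

    Hypothetical helper; half-up rounding is an assumed policy.
    """
    return (amount * (1 + rate)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(float_total == 0.30)               # False
print(decimal_total == Decimal("0.30"))  # True
print(add_sales_tax(Decimal("19.99"), Decimal("0.0825")))  # 21.64
```

In a general-purpose language this behavior usually has to be opted into through a library; the point in the list above is that COBOL treats it as a built-in capability.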

“COBOL is either good or adequate in all 4 (except for database access and GUI construction, they were designed into the language from the outset), whereas the COBOL replacement languages, like Visual Basic and Java are good at few if any of them,” Glass says. And as language designers started to take more of a universal, flexible approach:

“The unique capabilities of COBOL and the business reasons for them were lost in passage of time.”

This was true as of 1997. So, at least until the late nineties, there really wasn’t a successor that could carry out massive batch processes as sturdily as COBOL.

But the even bigger reason not to rock the boat is the sheer size and cost of replacing the billions of lines of COBOL that exist today. Many of these programs contain sensitive information about people, like social security numbers, banking info, credit card info and healthcare records. Creators of COBOL invested 2 trillion dollars in the universal language. Businesses worldwide run on over 220 billion lines of COBOL code today. It would be a herculean feat to replace every single business program with a brand new language without introducing detrimental bugs. Hence, the benefits of replacing COBOL just haven’t outweighed the cost.

The Case for the COBOL Comeback

Although COBOL is currently out of sight and out of mind, businesses have to focus on restitching the antiquated fabric of their infrastructure…eventually. Consider the average age of the COBOL programmer today. One survey of IT managers from 352 companies finds:

[Chart: Age of COBOL programmers]

That survey was taken 9 years ago. As of 2014, Micro Focus says the average COBOL programmer is still about 55 years old.

“Without a doubt, it is a challenge to find a developer in Cobol who is not nearing retirement age,” says Dale Vecchio, research vice president of application development at Gartner Inc. “In 2004, the last time Gartner tried to count Cobol programmers, the consultancy estimated that there were about 2 million of them worldwide and that the number was declining at 5% annually.”
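To get a feel for what a 5% annual decline means, here is a back-of-the-envelope Python sketch that compounds the Gartner figures quoted above. It assumes the rate stayed constant after 2004, which the source does not confirm, and `projected_headcount` is a hypothetical helper name.

```python
def projected_headcount(start: float, annual_decline: float, years: int) -> float:
    """Compound a constant yearly decline: start * (1 - rate) ** years."""
    return start * (1 - annual_decline) ** years

# 2 million programmers in 2004, shrinking 5% per year (assumed constant).
for year in (2004, 2009, 2014):
    estimate = projected_headcount(2_000_000, 0.05, year - 2004)
    print(year, round(estimate))
```

Under that assumption, the pool shrinks to roughly 1.2 million by 2014, a loss of about 40% in a decade.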

COBOL programmers are starting to retire; meanwhile, young programmers show little interest in taking on COBOL challenges. So, what will businesses do to keep maintaining their mission-critical programs? COBOL skills will be in demand to reverse-engineer pivotal mainframes. And as time goes on, the market will need to correct this skill gap by boosting the value of COBOL programming skills, drawing more engineers to learn the language.

Some COBOL-heavy organizations, like IBM and Micro Focus, have already developed programs to promote COBOL among younger generations. So far, IBM has developed curricula in association with more than 80 colleges and universities. Companies in Dallas are also doing something about it. Dr. Leon Kappelman, professor of Information Technology at the University of North Texas, told the Wall Street Journal:

“Four years ago, local Fortune 500 employers encouraged the university to offer Cobol courses. Now, graduates who take Cobol electives earn starting salaries of $75,000 compared to starting salaries of $62,500 for those who did not.”

Plus, many forums and online discussions about the language illustrate that…guys…it’s really not that bad. It’s important to note that, unlike modern languages, COBOL is not designed to be versatile. Once you adjust your expectations, COBOL isn’t as ugly. Some programmers enjoy the puzzle aspect of maintaining COBOL-based mainframes, like figuring out the exact line of code to fix an existing script. Others find the stable environment allows them to focus on the business aspects of the program. It’s why the few universities that offer COBOL courses mask it under “Intro to Business Systems Programming.”

As taboo as COBOL might be in the ping pong rooms of modern startup-driven culture today, its influence and irreplaceability will result in a spotlight on the dinosaur language again. Businesses must figure out who will maintain their mainframes when COBOL programmers retire in the near future.

If you ran a COBOL-oriented business today, what would you do to address the looming skill shortage? Tell us in the comment section below!


* The 70-80% stat on total COBOL business transactions is an estimate based on Gartner and Micro Focus sources. 

Livestream Recap: SnapDragon Shares Pearls of Programming Wisdom

On May 31st, the legendary, wise and experienced Derek Kisman (aka SnapDragon) graciously answered dozens upon dozens of questions posted by passionate hackers via HackerRank Live for 6 hours! What’s more, he did the entire live cast and coding on his Microsoft Surface Pro!

Some call him the mythical man. With over 20 years of programming experience, and a reputation as a self-taught award-winning top coder for many years, SnapDragon revealed insight into solving the toughest challenges from both ICPC 2014 and ZenHacks! SnapDragon has been a judge several times for the prestigious intercollegiate coding competition. Plus, he ranked #2 in the recent ZenHacks competition. Many of you live streamers requested a link to his solutions. As promised, here it is!

SnapDragon also talked about general programming advice for coders who want to get better at their programming skills. Hundreds of folks tuned in at a time, and his livestream video recording has garnered over 6,000 views to date!

We had a chance to catch up with SnapDragon to talk about the highlights of the event. Here are just a few pearls of wisdom from legendary SnapDragon.

Let’s get right to it! What’s your advice on solving the infamous ‘G’ problem from ICPC-2014?

Only 1 team was able to solve it. Teams shouldn’t be afraid to think about a problem just because they don’t see other teams solving it. G isn’t very difficult compared to the so-called “stoppers” of previous years. (It has a very short solution.) (4:30).

Top 5 tips for ranking at the top of the ZenHacks leaderboard:

1. You don’t have to reproduce the judge’s intended solution, you just need a successful submission.

2. For Composite Numbers, I didn’t know about the Lehmer method for computing pi(n), so I just went nuts optimizing a dumber solution.

3. For Efficient Journey, I didn’t quite figure out that much of the path ended up constant, but I got close enough using an O(log N) skip list.

4. For Introduction to Algebra 2, I just proved what little I could, brute-forced a few more examples, then submitted and hoped I was done. And it worked.

5. For Travel Trouble, I ran out of contest time and couldn’t implement the O(N^2) solution… but O(N^3) with some clever heuristics squeaked by.
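For readers unfamiliar with pi(n) from tip 2, it is the prime-counting function: the number of primes less than or equal to n. The sketch below shows the straightforward sieve-based way to compute it in Python. This is only an illustration of the kind of “dumber” baseline one might start optimizing from; it is not SnapDragon’s actual submission, and the Composite Numbers problem statement isn’t reproduced here. The function name `prime_count` is hypothetical.

```python
def prime_count(n: int) -> int:
    """Return pi(n), the number of primes <= n, via a sieve of Eratosthenes."""
    if n < 2:
        return 0
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # Cross out every multiple of p, starting at p * p.
            multiples = len(range(p * p, n + 1, p))
            is_prime[p * p :: p] = bytearray(multiples)
        p += 1
    return sum(is_prime)

print(prime_count(100))   # 25
print(prime_count(1000))  # 168
```

The sieve needs memory proportional to n, which is why methods like Lehmer’s exist for very large n. For contest-sized limits, a well-tuned simple approach can still squeak by, which is the spirit of tip 1.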

Your favorite question that someone asked you during the live event?

When someone asked about solving the Rubik’s Cube. 🙂 I tried to explain a nice trick for solving Rubik’s Cube based on group theory conjugation (like this one).

Advice on programming interviews:

You wouldn’t believe how many people we interview who just can’t code. If you go through HackerRank, you have to be able to think through problems, which is what companies want to see!

On creating a Mental Model when approaching problems:

I don’t think my mental model has much to do with what language I’m coding in. Granted, I’m always working in C++, but my mental model is basically mathematics, not programming. So, when I’m looking at a problem, I’m thinking: What does that look like mathematically? Then, programming is the art of transforming your mental model into something in a text file in a way that maps it as closely as possible.

[Photo: SnapDragon]

What does it take to be a world champion coder?

Math is really important. But you don’t need to be a world champion in math to be a world champion at programming. I’m sort of the example of that. Calculus isn’t TOO important in programming contests, but you need the basics. Like integrals.

What’s the best strategy to practice and learn?

I don’t think I ever picked particular algorithms and studied them. Back when I was on the University of Waterloo ACM team, we’d practice regional problems. We’d spend 4-5 hours and pretend it was an actual contest. Afterward, we’d figure out where our deficiencies were. Was it a new algorithm? Was it bugs? Sometimes there’s a sentence in the problem that you missed.

Is competitive programming necessary to crack companies’ coding interviews?

It does seem like companies like to ask algorithm-style questions in interviews. I’ve never seen questions that approach the complexity of some of these later ACM challenges. That’s because companies aren’t really trying to see whether or not you’re an algorithm master. They want to see how you think. They give you problems that are hard enough so they can hear your thoughts as you solve them. Going through HackerRank problems is a really good way to prepare for interviews for companies like Zenefits and Google. You wouldn’t believe how many people come in for interviews who just can’t code. They can’t write a program, compile it and run it.

So, if you practice on HackerRank a lot, you’ll do really well in all those standard interview questions. You won’t need to memorize a list of questions beforehand…you’ll be able to think on the fly. Interview questions are easier than the hard challenges you’ll find on HackerRank or elsewhere.

On Dynamic Programming:

The problem with Dynamic Programming is that it’s not just a single algorithm…it’s a concept. I learned DP largely from experience. It’s more of a generalized approach to searching. It’s the sort of thing you have to pick up by experience. The experience is being able to recognize what the states can look like. You need the number of states not to be exponential. You need it to be as small as you can get it. (36:43).
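As one concrete illustration of keeping the state space small, here is a minimal Python sketch of a textbook DP, the minimum-coin change problem. The problem choice and the function name `min_coins` are assumptions made for this example; they don’t come from the livestream. A naive search would branch on every coin at every step, which is exponential, but the only thing that matters is the remaining amount, so there are just `amount + 1` states.

```python
def min_coins(coins, amount):
    """Fewest coins that sum to `amount`, or -1 if it cannot be done.

    The DP state is simply "amount still to pay", so the table has
    amount + 1 entries instead of one entry per sequence of choices.
    """
    INF = float("inf")
    best = [0] + [INF] * amount  # best[s] = fewest coins summing to s
    for s in range(1, amount + 1):
        for c in coins:
            if c <= s and best[s - c] + 1 < best[s]:
                best[s] = best[s - c] + 1
    return -1 if best[amount] == INF else best[amount]

print(min_coins([1, 5, 10, 25], 63))  # 6  (25 + 25 + 10 + 1 + 1 + 1)
print(min_coins([1, 3, 4], 6))        # 2  (3 + 3)
print(min_coins([2], 3))              # -1 (impossible)
```

Spotting that “remaining amount” is a sufficient state is the recognition skill described above. The loops that fill the table are the easy part.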

On Network Flow:

Most programming contests seem to have Network Flow these days. It tends to be considered a hard problem. It’s sort of cheating in that you have an exponential search. Your path is important and you have to remember your path at all times, but you can erase it. (38:50).
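One way to read the remark about remembering a path but being able to erase it is through the residual graph used by augmenting-path algorithms: every unit of flow pushed along an edge adds capacity to the reverse edge, so a later path can undo an earlier choice. The sketch below is a minimal Python version of the standard Edmonds-Karp algorithm, offered as an illustration of that idea rather than as anything shown in the livestream. The graph format (a dict of dicts) and the example network are assumptions.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp: push flow along shortest augmenting paths until none remain."""
    if source == sink:
        return 0
    # Residual capacities, including zero-capacity reverse edges.
    residual = {source: {}, sink: {}}
    for u, edges in capacity.items():
        for v, cap in edges.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + cap
            residual[v].setdefault(u, 0)

    total = 0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push flow; the reverse edge lets a later path "erase" this choice.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck

# Hypothetical example network.
network = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}}
print(max_flow(network, "s", "t"))  # 5
```

The search itself never backtracks explicitly. The reverse edges do the undoing, which is what keeps the algorithm polynomial.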

How did competitive programming help you solve the logic puzzle that was unsolved for 30 years?

There was a puzzle called Panex for which the shortest known solution was something like 30,000 moves. For 30 years, it wasn’t proven. I was able to come in and apply a very hard, very advanced form of dynamic programming to solve it — to actually prove that the min was 30,000.

Favorite book recommendation for math?

  • Gödel, Escher, Bach (thinking about the world as a computer scientist)
  • Martin Gardner’s articles in Scientific American

Watch the full live Q&A video here.