A recently published article looks at evaluating First Programming Languages (FPLs), that is, the languages used to teach an introductory programming course.
One existing problem is that there is no well-defined way to formally assess a programming language, and much of the evidence is anecdotal. The proposed evaluation framework looks at technical and environmental feature sets. "The technical feature set covers the language theoretical aspects, whereas, the environmental feature set helps evaluating the external factors." These feature sets are covered in Table 2 of the article (link to PDF) and consist of the following:
The article explains each of these points in detail and gives each of the languages being evaluated a rating based on that explanation. It then describes in detail how the scores can be compared, which includes letting an evaluator weight the criteria they consider important more heavily than the others. Since the goal is to choose a language for teaching someone to program, different institutions will have different reasons and goals, and so will want to weight things differently.
As the default weight settings do not conform to the original popularity index of the languages, so there should be a different weighting criterion. However, it is very hard to come up with a generic and correct weighting criterion. Therefore, the scoring function should be customizable and the user should be able to tune the weight of each feature based on her preferences. As an example, consider the fact that Ada holds 3rd position in overall scoring, but is not being considered among highly used FPLs as of now.
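To make the idea of a tunable scoring function concrete, here is a minimal sketch in Python. It is not the article's actual formula: the weighted average, the feature names, the language names `LangA` and `LangB`, and all the ratings are made-up placeholders chosen only to show how changing the weights can change the ranking.

```python
from typing import Dict, List

Ratings = Dict[str, float]


def weighted_score(ratings: Ratings, weights: Dict[str, float]) -> float:
    """Weighted average of one language's feature ratings."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one feature needs a positive weight")
    return sum(ratings[feature] * w for feature, w in weights.items()) / total_weight


def rank(languages: Dict[str, Ratings], weights: Dict[str, float]) -> List[str]:
    """Language names ordered from highest to lowest weighted score."""
    return sorted(
        languages,
        key=lambda name: weighted_score(languages[name], weights),
        reverse=True,
    )


# Hypothetical ratings on a 1-5 scale; not taken from the article.
languages = {
    "LangA": {"readability": 4, "tooling": 2, "industry_demand": 1},
    "LangB": {"readability": 3, "tooling": 4, "industry_demand": 5},
}

# Every feature counts the same.
equal_weights = {"readability": 1, "tooling": 1, "industry_demand": 1}

# An evaluator who cares mostly about readability for beginners.
teaching_weights = {"readability": 10, "tooling": 1, "industry_demand": 0}

print(rank(languages, equal_weights))     # LangB comes out ahead
print(rank(languages, teaching_weights))  # LangA comes out ahead
```

With equal weights the placeholder `LangB` ranks first, while the readability-heavy weights put `LangA` first, which is the kind of shift the customizable weighting is meant to allow.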
If you're dealing with large amounts of data, I'd actually recommend you extend your dabbling with C++ to work, too, or at least to some other lowish-level language, because relying on the likes of Matlab and Python is going to kill your runtimes. Dedicated C and C++ routines are easy to call from Python (give SWIG a look: http://www.swig.org/Doc1.3/Python.html), which gives you the best of both worlds -- Python to set a problem up, and native code to get the performance boost. (Fortran works well with massive datasets, since ultimately they're just big arrays and that's still one of Fortran's major strengths. It's a bit of a pain to call from other languages, though. You could go through C interfaces to get there, which wouldn't be too shabby so long as you were careful to transpose your matrices along the way, since Fortran stores arrays column-major and C stores them row-major, and that conversion itself comes with a cost. Depending on the situation, you may well find yourself better off sticking with C-like languages.)
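For a quick taste of calling native code from Python without any build step, here's a rough sketch. Note that it uses the standard-library `ctypes` module rather than SWIG, and it assumes a POSIX-like system where the C standard library can be loaded into the process; the wrapper name `c_strlen` is just made up for the example.

```python
import ctypes
import ctypes.util

# Locate the C standard library. If the lookup fails, CDLL(None) falls back
# to the symbols already loaded in the current process (POSIX only).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Length of a byte string as computed by the native C strlen."""
    return libc.strlen(data)


print(c_strlen(b"native code"))
```

For real number crunching you'd point `ctypes.CDLL` (or a SWIG-generated wrapper) at your own compiled library instead of libc, but the pattern of declaring argument and return types and then calling the function like any other Python callable stays the same.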