Automatic Bug Repair

Computer science at MIT has always been one of the most admirable things. The researchers are back with a bang and have found a way to repair bugs automatically. The following article was published on MIT News:

At the Association for Computing Machinery’s Programming Language Design and Implementation conference this month, MIT researchers presented a new system that repairs dangerous software bugs by automatically importing functionality from other, more secure applications.

Remarkably, the system, dubbed CodePhage, doesn’t require access to the source code of the applications whose functionality it’s borrowing. Instead, it analyzes the applications’ execution and characterizes the types of security checks they perform. As a consequence, it can import checks from applications written in programming languages other than the one in which the program it’s repairing was written.

Once it’s imported code into a vulnerable application, CodePhage can provide a further layer of analysis that guarantees that the bug has been repaired.

“We have tons of source code available in open-source repositories, millions of projects, and a lot of these projects implement similar specifications,” says Stelios Sidiroglou-Douskos, a research scientist at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) who led the development of CodePhage. “Even though that might not be the core functionality of the program, they frequently have subcomponents that share functionality across a large number of projects.”

With CodePhage, he says, “over time, what you’d be doing is building this hybrid system that takes the best components from all these implementations.”

Sidiroglou-Douskos and his coauthors — MIT professor of computer science and engineering Martin Rinard, graduate student Fan Long, and Eric Lahtinen, a researcher in Rinard’s group — refer to the program CodePhage is repairing as the “recipient” and the program whose functionality it’s borrowing as the “donor.” To begin its analysis, CodePhage requires two sample inputs: one that causes the recipient to crash and one that doesn’t. A bug-locating program that the same group reported in March, dubbed DIODE, generates crash-inducing inputs automatically. But a user may simply have found that trying to open a particular file caused a crash.

Carrying the past

First, CodePhage feeds the “safe” input — the one that doesn’t induce crashes — to the donor. It then tracks the sequence of operations the donor executes and records them using a symbolic expression, a string of symbols that describes the logical constraints the operations impose.

At some point, for instance, the donor may check to see whether the size of the input is below some threshold. If it is, CodePhage will add a term to its growing symbolic expression that represents the condition of being below that threshold. It doesn’t record the actual size of the file — just the constraint imposed by the check.

Next, CodePhage feeds the donor the crash-inducing input. Again, it builds up a symbolic expression that represents the operations the donor performs. When the new symbolic expression diverges from the old one, however, CodePhage interrupts the process. The divergence represents a constraint that the safe input met and the crash-inducing input does not. As such, it could be a security check missing from the recipient.

CodePhage then analyzes the recipient to find locations at which the input meets most, but not quite all, of the constraints described by the new symbolic expression. The recipient may perform different operations in a different order than the donor does, and it may store data in different forms. But the symbolic expression describes the state of the data after it’s been processed, not the processing itself.

At each of the locations it identifies, CodePhage can dispense with most of the constraints described by the symbolic expression — the constraints that the recipient, too, imposes. Starting with the first location, it translates the few constraints that remain into the language of the recipient and inserts them into the source code. Then it runs the recipient again, using the crash-inducing input.

If the program holds up, the new code has solved the problem. If it doesn’t, CodePhage moves on to the next candidate location in the recipient. If the program is still crashing, even after CodePhage has tried repairs at all the candidate locations, it returns to the donor program and continues building up its symbolic expression, until it arrives at another point of divergence.
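
The safe-versus-crashing comparison at the heart of this loop can be sketched in a few lines. Everything below is a toy illustration, not CodePhage itself: `donor_checks` and its constraints are invented stand-ins for the symbolic traces the real system extracts from donor executions.

```python
def donor_checks(data: bytes):
    """Hypothetical donor: yields (constraint description, passed?) pairs,
    one per security check the donor performs on the input."""
    yield ("len(input) < 1024", len(data) < 1024)
    yield ("input is non-empty", len(data) > 0)

def first_divergence(safe: bytes, crashing: bytes):
    """Return the first donor constraint that the safe input satisfies
    but the crash-inducing input does not: a candidate check to graft
    into the recipient."""
    for (name, ok_safe), (_, ok_crash) in zip(donor_checks(safe),
                                              donor_checks(crashing)):
        if ok_safe and not ok_crash:
            return name
    return None

print(first_divergence(b"hello", b"x" * 4096))  # len(input) < 1024
```

In the real system the constraint would then be translated into the recipient's language, inserted at a candidate location, and re-tested against the crash-inducing input, exactly as the loop above describes.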

Automated future

The researchers tested CodePhage on seven common open-source programs in which DIODE had found bugs, importing repairs from between two and four donors for each. In all instances, CodePhage was able to patch up the vulnerable code, and it generally took between two and 10 minutes per repair.

As the researchers explain, in modern commercial software, security checks can take up 80 percent of the code — or even more. One of their hopes is that future versions of CodePhage could drastically reduce the time that software developers spend on grunt work, by automating those checks’ insertion.

“The longer-term vision is that you never have to write a piece of code that somebody else has written before,” Rinard says. “The system finds that piece of code and automatically puts it together with whatever pieces of code you need to make your program work.”

“The technique of borrowing code from another program that has similar functionality, and being able to take a program that essentially is broken and fix it in that manner, is a pretty cool result,” says Emery Berger, a professor of computer science at the University of Massachusetts at Amherst. “To be honest, I was surprised that it worked at all.”

“The donor program was not written by the same people,” Berger explains. “They have different coding standards; they name variables differently; they use all kinds of different variables; the variables could be local; or they could be higher up in the stack. And CodePhage is able to identify these connections and say, ‘These variables correlate to these variables.’ Speaking in terms of organ donation, it transforms that code to make it a perfect graft, as if it had been written that way in the beginning. The fact that it works as well as it does is surprising — and cool.”

Google will be able to cure Cancer

In a blog post released Wednesday, Google noted, “In order to scale up by the next order of magnitude, Broad and Google will work together to explore how to build new tools and find new insights to propel biomedical research, using deep bioinformatics expertise, powerful analytics, and massive computing infrastructure. Collaboration between the world’s premier genomics and biomedical research center and the most advanced computing infrastructure can help develop a new generation of tools and services that will enable scientists — from large academic institutions, commercial organizations, or small research labs in remote corners of the world — to uncover a wealth of biological insight.”

Google Genomics was begun two years ago “to help the life science community organize the world’s genomic information and make it accessible and useful,” and now, as the program forms new partnerships, it is breaking new ground and discovering new potential applications. And as growing amounts of data are gathered, the cloud is becoming more and more necessary as a storage unit for the veritable treasure trove of knowledge produced by researchers across the world.

“Large-scale genomic information is accelerating scientific progress in cancer, diabetes, psychiatric disorders, and many other diseases,” noted Eric Lander, president and director of Broad Institute in a statement. “Storing, analyzing, and managing these data is becoming a critical challenge for biomedical researchers.” Luckily, with this new collaboration with the Google Cloud Platform, storing and processing large amounts of data will no longer be as daunting a task.

“Through our collaboration with Broad Institute and our work with the Global Alliance for Genomics and Health and the life science community, we believe we can make a difference in improving human health,” the Google blog post reads. “By making it easier for researchers to ask big questions and find answers amid complexity, we hope to unleash scientific creativity that could significantly improve our understanding of health and disease.”

Researchers almost achieve Absolute Zero!

Thanks to Elsa’s freezing powers (well, lasers and some advanced techniques), a team of MIT scientists has managed to cool a molecule to 500 nanokelvins: a temperature that’s nearly absolute zero. Not zero degrees Fahrenheit, but absolute zero, which is around -459.67 degrees F — a lot colder than the cold parts of space. See, in their natural state, molecules vibrate, rotate, and generally move at a frantic pace, like interns working for Miranda Priestly. By cooling them down to the point that they’re barely able to move, scientists can form previously unseen states of matter. According to MIT physics professor Martin Zwierlein: “…with ultracold molecules, you can get a huge variety of different states of matter, like superfluid crystals, which are crystalline, yet feel no friction, which is totally bizarre. This has not been observed so far, but predicted. We might not be far from seeing these effects, so we’re all excited.”
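
For reference, the kelvin-to-Fahrenheit conversion behind that -459.67 figure is simple arithmetic:

```python
def kelvin_to_fahrenheit(k: float) -> float:
    """Convert kelvins to degrees Fahrenheit: F = K * 9/5 - 459.67."""
    return k * 9 / 5 - 459.67

print(kelvin_to_fahrenheit(0))       # -459.67, absolute zero
print(kelvin_to_fahrenheit(500e-9))  # 500 nK sits a hair above absolute zero
```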

For this particular study, the scientists decided to freeze clouds of sodium and potassium using lasers and evaporative cooling. Then they glued individual atoms together to form sodium potassium (NaK) molecules, which were again subjected to laser beams to effectively suck out 7,500 kelvins of energy in all. To get those superfluid crystals Zwierlein mentioned, though, the scientists have to go even lower than 500 nanokelvins. They also have to experiment with other atoms and molecules, but this is definitely a promising start.

Source: MIT via Gizmodo

Changeable Textures On Demand

3D printing technology has certainly come a long way since its early days, and it is nice to see progress happening in the right direction. The creation of surfaces is not too difficult – in fact, it is rather simple to churn out smooth ones, or bumpy ones. How about a “hybrid” surface? That now seems very possible, thanks to researchers over at MIT who have successfully come up with one that can be both – depending on how you like it.

This newly developed 3D-printed surface can, depending on the situation, be smooth, bumpy, ridged, or channeled, and better yet, it can dynamically change texture when pressure is applied. Is there any practical application for such a surface? Well, mutable surface textures are useful for preventing unwanted organisms from growing on ship hulls, channeling microscale amounts of fluid, camouflage, and optics. Even more important, the surfaces created by the MIT team are produced using a 3D printer, which makes it all the easier to get creative with this particular technique.

The robotic cheetah jumps!

In a leap for robot development, the MIT researchers who built a robotic cheetah have now trained it to see and jump over hurdles as it runs — making this the first four-legged robot to run and jump over obstacles autonomously.

To get a running jump, the robot plans out its path, much like a human runner: As it detects an approaching obstacle, it estimates that object’s height and distance. The robot gauges the best position from which to jump, and adjusts its stride to land just short of the obstacle, before exerting enough force to push up and over. Based on the obstacle’s height, the robot then applies a certain amount of force to land safely, before resuming its initial pace.

In experiments on a treadmill and an indoor track, the cheetah robot successfully cleared obstacles up to 18 inches tall — more than half of the robot’s own height — while maintaining an average running speed of 5 miles per hour.
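
The height-based force planning described above can be roughed out with a textbook projectile model. This is an illustrative sketch only, not the robot's actual controller; the clearance margin is an invented parameter.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def takeoff_speed(obstacle_height_m: float, clearance_m: float = 0.05) -> float:
    """Vertical takeoff speed needed to clear the obstacle plus a small
    safety margin, from the ballistic relation v = sqrt(2 * g * h).
    A toy model: the real controller also manages stride timing,
    balance, and impact on landing."""
    h = obstacle_height_m + clearance_m
    return math.sqrt(2 * G * h)

# An 18-inch (~0.457 m) hurdle, the tallest cleared in the experiments:
print(round(takeoff_speed(0.457), 2))  # ~3.15 m/s of vertical speed
```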

“A running jump is a truly dynamic behavior,” says Sangbae Kim, an assistant professor of mechanical engineering at MIT. “You have to manage balance and energy, and be able to handle impact after landing. Our robot is specifically designed for those highly dynamic behaviors.”

Kim and his colleagues — including research scientist Hae won Park and postdoc Patrick Wensing — will demonstrate their cheetah’s running jump at the DARPA Robotics Challenge in June, and will present a paper detailing the autonomous system in July at the conference Robotics: Science and Systems.

Source: http://newsoffice.mit.edu

Handling Big Data gets easier!

As anyone who’s ever used a spreadsheet can attest, it’s often convenient to organize data into tables. But in the age of big data, those tables can be enormous, with millions or even hundreds of millions of rows.

One way to make big-data analysis computationally practical is to reduce the size of data tables — or matrices, to use the mathematical term — by leaving out a bunch of rows. The trick is that the remaining rows have to be in some sense representative of the ones that were omitted, in order for computations performed on them to yield approximately the right results.

At the ACM Symposium on Theory of Computing in June, MIT researchers will present a new algorithm that finds the smallest possible approximation of the original matrix that guarantees reliable computations. For a class of problems important in engineering and machine learning, this is a significant improvement over previous techniques. And for all classes of problems, the algorithm finds the approximation as quickly as possible.

In order to determine how well a given row of the condensed matrix represents a row of the original matrix, the algorithm needs to measure the “distance” between them. But there are different ways to define “distance.”

One common way is so-called “Euclidean distance.” In Euclidean distance, the differences between the entries at corresponding positions in the two rows are squared and added together, and the distance between rows is the square root of the resulting sum. The intuition is that of the Pythagorean theorem: The square root of the sum of the squares of the lengths of a right triangle’s legs gives the length of the hypotenuse.

Another measure of distance is less common but particularly useful in solving machine-learning and other optimization problems. It’s called “Manhattan distance,” and it’s simply the sum of the absolute differences between the corresponding entries in the two rows.

Inside the norm

In fact, both Manhattan distance and Euclidean distance are instances of what statisticians call “norms.” The Manhattan distance, or 1-norm, is the first root of the sum of differences raised to the first power, and the Euclidean distance, or 2-norm, is the square root of the sum of differences raised to the second power. The 3-norm is the cube root of the sum of differences raised to the third power, and so on to infinity.
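
The family of norms just described is easy to state in code; a minimal sketch of the distance between two rows under any p-norm:

```python
def p_norm_distance(row_a, row_b, p):
    """Distance between two rows under the p-norm: the p-th root of the
    sum of the absolute entrywise differences raised to the p-th power."""
    return sum(abs(a - b) ** p for a, b in zip(row_a, row_b)) ** (1 / p)

a, b = [1.0, 2.0, 3.0], [4.0, 6.0, 3.0]
print(p_norm_distance(a, b, 1))  # 7.0: Manhattan, |3| + |4| + |0|
print(p_norm_distance(a, b, 2))  # 5.0: Euclidean, sqrt(9 + 16 + 0)
```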

In their paper, the MIT researchers — Richard Peng, a postdoc in applied mathematics, and Michael Cohen, a graduate student in electrical engineering and computer science — demonstrate that their algorithm is optimal for condensing matrices under any norm. But according to Peng, “The one we really cared about was the 1-norm.”

In matrix condensation — under any norm — the first step is to assign each row of the original matrix a “weight.” A row’s weight represents the number of other rows that it’s similar to, and it determines the likelihood that the row will be included in the condensed matrix. If it is, its values will be multiplied according to its weight. So, for instance, if 10 rows are good stand-ins for each other, but not for any other rows of the matrix, each will have a 10 percent chance of getting into the condensed matrix. If one of them does, its entries will all be multiplied by 10, so that it will reflect the contribution of the other nine rows it’s standing in for.
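
The keep-with-probability-1/weight, rescale-by-weight scheme described above can be sketched as follows. The weights here are supplied by hand; computing them correctly is the hard part the paper addresses.

```python
import random

def condense(matrix, weights, rng):
    """Keep each row with probability 1/weight; scale a kept row's
    entries by its weight so it stands in for the rows that were dropped."""
    condensed = []
    for row, w in zip(matrix, weights):
        if rng.random() < 1 / w:  # weight 10 -> 10 percent chance of being kept
            condensed.append([x * w for x in row])
    return condensed

# Ten nearly interchangeable rows, each given weight 10; the few that
# survive are multiplied by 10 to reflect the other nine.
rng = random.Random(42)
matrix = [[1, 2]] * 10
print(condense(matrix, [10] * 10, rng))
```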

Although Manhattan distance is in some sense simpler than Euclidean distance, it makes calculating rows’ weights more difficult. Previously, the best algorithm for condensing matrices under the 1-norm would yield a matrix whose number of rows was proportional to the number of columns of the original matrix raised to the power of 2.5. The best algorithm for condensing matrices under the 2-norm, however, would yield a matrix whose number of rows was proportional to the number of columns of the original matrix times its own logarithm.

That means that if the matrix had 100 columns, under the 1-norm, the best possible condensation, before Peng and Cohen’s work, was a matrix with hundreds of thousands of rows. Under the 2-norm, it was a matrix with a couple of hundred rows. That discrepancy grows as the number of columns increases.

Taming recursion

Peng and Cohen’s algorithm condenses matrices under the 1-norm as well as it does under the 2-norm; under the 2-norm, it condenses matrices as well as its predecessors do. That’s because, for the 2-norm, it simply uses the best existing algorithm. For the 1-norm, it uses the same algorithm, but it uses it five or six times.

The paper’s real contribution is to mathematically prove that the 2-norm algorithm will yield reliable results under the 1-norm. As Peng explains, an equation for calculating 1-norm weights has been known for some time. But “the funny thing with that definition is that it’s recursive,” he says. “So the correct set of weights appears on both the left-hand side and the right-hand side.” That is, the weight for a given matrix row — call it w — is set equal to a mathematical expression that itself includes w.

“This definition was known to exist, but people in stats didn’t know what to do with it,” Peng says. “They look at it and think, ‘How do I ever compute anything with this?’”

What Peng and Cohen prove is that if you start by setting the w on the right side of the equation equal to 1, then evaluate the expression and plug the answer back into the right-hand w, then do the same thing again, and again, you’ll quickly converge on a good approximation of the correct value of w.
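
The iteration Peng describes is a standard fixed-point scheme. Here is a toy stand-in; the equation below is invented purely to show the plug-back-in loop, and the real 1-norm weight equation is far more involved.

```python
def solve_fixed_point(f, w=1.0, steps=50):
    """Repeatedly plug the current answer back into the right-hand side."""
    for _ in range(steps):
        w = f(w)
    return w

# Toy recursive weight equation: w = 0.5 * w + 2, whose self-consistent
# solution is w = 4. Starting from w = 1, the iteration converges fast.
w = solve_fixed_point(lambda w: 0.5 * w + 2)
print(round(w, 6))  # 4.0
```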

“It’s highly elegant mathematics, and it gives a significant advance over previous results,” says Richard Karp, a professor of computer science at the University of California at Berkeley and a winner of the National Medal of Science and of the Turing Award, the highest honor in computer science. “It boils the original problem down to a very simple-to-understand one. I admire the mathematical development that went into it.”

Copyright © news.mit.edu

Best Sites to Learn Coding Online

There’s no reason why you shouldn’t know the basics of coding. You can automate tasks, program your Excel sheets, improve workflows, extract data from websites, and accomplish so much more with code. You may not be in the business of writing software programs, but knowing the basics of coding will help you communicate more effectively with developers.

Gone are the days when you had to enroll in expensive computer training classes; there now exists a plethora of web-based courses that will help you learn programming at your own pace, in the comfort of your web browser.

The Best Sites to Learn Programming

If you are ready to take the plunge, here are some of the best websites that offer courses in a variety of programming languages for free. I have also added a list of companion ebooks that will give you a more in-depth understanding of the language and they don’t cost anything either.

JavaScript
  Courses: Code Academy, Learn Street, Code Combat, Code Avengers
  Free books: Eloquent JavaScript, JavaScript Guide, Speaking JS, JS The Right Way, Oh My JS, Canvassing

HTML & CSS
  Courses: Code Academy, Don’t Fear The Internet, Tutsplus, Learn Layout, A to Z CSS, Dash, Web Accessibility, The Hello World, Khan Academy, HTML5 from Scratch, Mozilla
  Free books: Dive into HTML5, 20 Things I Learned, HTML Dog, HTML & CSS, HTML5 for Designers, DOM Enlightenment, HTML Canvas

jQuery
  Courses: Code Academy, Tutsplus, Code School
  Free books: jQuery Fundamentals, Learn jQuery

Python
  Courses: Code Academy, Google, Learn Street, Python Tutor, IHeartPY
  Free books: Python for You and Me, Dive into Python, Learn Python the Hard Way, Think Python, Python for Fun, Tango with Django, Django

Ruby & Ruby on Rails
  Courses: Code Academy, TryRuby, Code Learn, Railscasts, Rubymonk, Learn Street
  Free books: Why’s (Poignant) Guide to Ruby, Learn Ruby the Hard Way, Learn to Program, Learn Rails by Example

PHP
  Courses: Code Academy
  Free books: PHP Programming, Practical PHP
  Also see: How to Learn Regular Expressions (RegEx)

Google Apps Script
  Courses: Getting Started, Office Hours, Google Scripts Examples, Learning Apps Script

WordPress
  Courses: Treehouse, WordPress TV

Linux & Shell Scripting
  Courses: Stanford.edu, Explain Shell
  Free books: Conquer the Command Line

Node.js
  Courses: Nodetuts, Node School
  Free books: The Node Beginner Book, Mixu’s Node book, Node Up and Running, Mastering Node.js

Angular JS
  Courses: Code School, Egg Head, Learn Angular
  Free books: Angular JS Tutorial, Thinking Angular, Angular Tutorial, Getting Started (Adobe)
  Also see: Learn Touch Typing & Code Faster

Git (version control)
  Courses: Code School, Git Immersion, GitHub Training, Udacity
  Free books: Pro Git, Learn Git, Gists in Github

Objective-C (iOS & Mac)
  Courses: Code School, Stanford iTunesU

Chrome Dev Tools
  Courses: Code School, Dev Tools Secret, Chrome Dev Tools Tutorial, Udacity
  Free books: Building Browser Apps

Go Language
  Courses: Golang.org, GopherCasts
  Free books: Programming in Go, Go by Example, Learning Go, Building Web Apps with Go

Java
  Courses: Learn Java, Coding Bat, Java Udemy, Learneroo
  Free books: Programming in Java, Thinking in Java, O’Reilly Learning Java, Think Java, Java & CS, Java for Python Devs

Android App Development
  Courses: Udacity (Google Developers), Coursera, The New Boston, Google University
  Free books: App Development Essentials, Code Learn, App Inventor (Visual)

D3 (data visualization)
  Courses: Data Visualization for the Web, Dashing D3
  Free books: D3 Tips & Tricks
  Also see: Learn VIM, the text editor for programmers

SQL (Databases)
  Courses: SQL Zoo, SQL @Stanford, Essential SQL
  Free books: SQL for Nerds, Intro to SQL, SQL Bolt, PHP & MySQL

Everything Else
  Courses: Udacity, edX.org, Coursera, Udemy ($), Lynda ($), Pluralsight ($), Treehouse ($), Open Consortium, One Month Rails ($)

Entries marked with ($) are paid.

Teaching Kids to Code

If there are kids in the family, you should download either Tynker (Android/iOS) or the Hopscotch app for iPad, and they can learn the basics of programming through games and puzzles.

There’s also Scratch, an MIT project that allows kids to program their own stories and games visually. Scratch is available as a web app, or you can download it on your Mac/Windows/Linux computer for offline use. Microsoft TouchDevelop, Blockly, and Alice are some other web apps that will introduce the concepts of computer programming to your children.