Whereas PathNet permanently freezes parameters and pathways used for previously learned tasks, in this case the authors compute how important each connection is to the most recently learned task, and protect each connection from future modification by an amount proportional to its importance. Important pathways tend to persist, and unimportant pathways tend to be discarded, gradually freeing "underused" connections for learning new tasks.
The authors call this process Elastic Weight Consolidation (EWC). Figure 1 in the paper does a great job of explaining how EWC finds solutions in parameter space that are good for new tasks without incurring significant losses on previous tasks.
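If I read the paper right, the importance is estimated from the diagonal of the Fisher information, and each old weight gets a quadratic pull back toward its previous value scaled by that importance. Here is a rough sketch of that penalty as I understand it; this is my own illustration, not the authors' code, and the names (`fisher`, `old_params`, `lam`) and the PyTorch framing are my assumptions:

```python
# Sketch of an EWC-style penalty (my reading of the paper, not the authors' code).
import torch

def estimate_fisher(model, data_loader, loss_fn):
    """Rough diagonal Fisher estimate: average squared gradients on the old task.
    (Uses the empirical Fisher with the dataset labels as an approximation.)"""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    n_batches = 0
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
        n_batches += 1
    return {n: f / max(n_batches, 1) for n, f in fisher.items()}

def ewc_penalty(model, fisher, old_params, lam=1000.0):
    """Quadratic pull toward the old task's weights, scaled by importance."""
    penalty = 0.0
    for n, p in model.named_parameters():
        penalty = penalty + (fisher[n] * (p - old_params[n]) ** 2).sum()
    return 0.5 * lam * penalty

# While training on the new task:
#   loss = new_task_loss + ewc_penalty(model, fisher, old_params)
```

So connections with large Fisher values barely move (important pathways persist), while connections with near-zero values are effectively free to be reused for the new task.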
That's pretty incredible - I'm no AI expert but it certainly sounds like this algorithm provides an ANN equivalent of neuroplasticity, which seems like a big step.
I'm confused. I don't get what the novelty in this is. It looks like all they do is include an input that identifies different tasks and then train one neural network to learn a separate distribution for each task, with some weight sharing...
Of course, people have done this before [1]. There is quite a bit of research looking into multi-task learning. Just look through some of the references in that Luong et al. paper. DeepMind has been putting out some amazing research lately, but this paper definitely does not fall in that category.
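For what it's worth, the kind of setup I'm describing looks roughly like the toy sketch below (my own illustration, not taken from either paper): a shared trunk plus one output head per task, selected by a task id.

```python
# Toy multi-task network with weight sharing (illustrative only).
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, in_dim=784, hidden=256, out_dims=(10, 10)):
        super().__init__()
        # Shared weights used by every task.
        self.trunk = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        # One task-specific head per task.
        self.heads = nn.ModuleList([nn.Linear(hidden, d) for d in out_dims])

    def forward(self, x, task_id):
        return self.heads[task_id](self.trunk(x))

# net = MultiTaskNet()
# logits = net(torch.randn(32, 784), task_id=0)
```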
Less silly than it looks at first sight. After all, for day-to-day use (except for GPS, I guess, and most people probably don't even realize it would not work without a correction for relativistic effects), relativity offers very little gain over classical Newtonian physics and is a lot more complex to work out mathematically.
So even though 'that's how it really should work' we tend to take the shortcut because it is 'good enough' for almost all use cases.
Which caused us to miss the wood for the trees for a long time. This minor change is what enables learning in the first place, and as such it could easily be a game changer.
Research engineers turn theory, pseudocode, or smaller proofs of concept into a more fleshed-out implementation. Once a research project exceeds a few thousand lines of code, it becomes useful to have dedicated engineers doing architectural design, owning unit testing / backtesting frameworks, code quality control, etc.
Source: was a research engineer at Intel Labs several years ago.
While this is true, it is also sometimes merely based on your degree. I was a "Research Engineer" doing the same work as "Research Scientists" because my degree was in "Computer Engineering" not "Computer Science."
Very cool!