diff options
Diffstat (limited to 'site')
-rw-r--r-- | site/blog-feed.xml | 4
-rw-r--r-- | site/blog.html | 4
-rw-r--r-- | site/index.html | 4
-rw-r--r-- | site/microblog-feed.xml | 156
-rw-r--r-- | site/microblog.html | 92
-rw-r--r-- | site/posts/2018-04-10-update-open-research.html | 4
6 files changed, 253 insertions, 11 deletions
diff --git a/site/blog-feed.xml b/site/blog-feed.xml index 26b4946..fac8ce3 100644 --- a/site/blog-feed.xml +++ b/site/blog-feed.xml @@ -19,8 +19,8 @@ </author> <content type="html"><p>It has been 9 months since I last wrote about open (maths) research. Since then two things happened which prompted me to write an update.</p> <p>As always I discuss open research only in mathematics, not because I think it should not be applied to other disciplines, but simply because I do not have experience nor sufficient interests in non-mathematical subjects.</p> -<p>First, I read about Richard Stallman the founder of the free software movement, in <a href="http://shop.oreilly.com/product/9780596002879.do">his biography by Sam Williams</a> and his own collection of essays <a href="https://shop.fsf.org/books-docs/free-software-free-society-selected-essays-richard-m-stallman-3rd-edition"><em>Free software, free society</em></a>, from which I learned a bit more about the context and philosophy of free software and open source software. For anyone interested in open research, I highly recommend having a look at these two books. I am also reading Levy’s <a href="http://www.stevenlevy.com/index.php/books/hackers">Hackers</a>, which documented the development of the hacker culture predating Stallman. I can see the connection of ideas from the hacker ethic to free software to the open source philosophy. My guess is that the software world is fortunate to have pioneers who advocated for freedom and openness from the beginning, whereas for academia which has a much longer history, credit protection has always been a bigger concern.</p> -<p>Also a month ago I attended a workshop called <a href="https://www.perimeterinstitute.ca/conferences/open-research-rethinking-scientific-collaboration">Open research: rethinking scientific collaboration</a>. 
That was the first time I met a group of people (mostly physicists) who also want open research to happen, and we had some stimulating discussions.</p> +<p>First, I read about Richard Stallman the founder of the free software movement, in <a href="http://shop.oreilly.com/product/9780596002879.do">his biography by Sam Williams</a> and his own collection of essays <a href="https://shop.fsf.org/books-docs/free-software-free-society-selected-essays-richard-m-stallman-3rd-edition"><em>Free software, free society</em></a>, from which I learned a bit more about the context and philosophy of free software and its relation to that of open source software. For anyone interested in open research, I highly recommend having a look at these two books. I am also reading Levy’s <a href="http://www.stevenlevy.com/index.php/books/hackers">Hackers</a>, which documented the development of the hacker culture predating Stallman. I can see the connection of ideas from the hacker ethic to the free software philosophy and to the open source philosophy. My guess is that the software world is fortunate to have pioneers who advocated for various kinds of freedom and openness from the beginning, whereas for academia which has a much longer history, credit protection has always been a bigger concern.</p> +<p>Also a month ago I attended a workshop called <a href="https://www.perimeterinstitute.ca/conferences/open-research-rethinking-scientific-collaboration">Open research: rethinking scientific collaboration</a>. That was the first time I met a group of people (mostly physicists) who also want open research to happen, and we had some stimulating discussions. 
Many thanks to the organisers at Perimeter Institute for organising the event, and special thanks to <a href="https://www.perimeterinstitute.ca/people/matteo-smerlak">Matteo Smerlak</a> and <a href="https://www.perimeterinstitute.ca/people/ashley-milsted">Ashley Milsted</a> for invitation and hosting.</p> <p>From both of these I feel like I should write an updated post on open research.</p> <h3 id="freedom-and-community">Freedom and community</h3> <p>Ideals matter. Stallman’s struggles stemmed from the frustration of denied request of source code (a frustration I shared in academia except source code is replaced by maths knowledge), and revolved around two things that underlie the free software movement: freedom and community. That is, the freedom to use, modify and share a work, and by sharing, to help the community.</p> diff --git a/site/blog.html b/site/blog.html index 34a2f8c..66e965b 100644 --- a/site/blog.html +++ b/site/blog.html @@ -23,8 +23,8 @@ <p>Posted on 2018-04-29</p> <p>It has been 9 months since I last wrote about open (maths) research. Since then two things happened which prompted me to write an update.</p> <p>As always I discuss open research only in mathematics, not because I think it should not be applied to other disciplines, but simply because I do not have experience nor sufficient interests in non-mathematical subjects.</p> -<p>First, I read about Richard Stallman the founder of the free software movement, in <a href="http://shop.oreilly.com/product/9780596002879.do">his biography by Sam Williams</a> and his own collection of essays <a href="https://shop.fsf.org/books-docs/free-software-free-society-selected-essays-richard-m-stallman-3rd-edition"><em>Free software, free society</em></a>, from which I learned a bit more about the context and philosophy of free software and open source software. For anyone interested in open research, I highly recommend having a look at these two books. 
I am also reading Levy’s <a href="http://www.stevenlevy.com/index.php/books/hackers">Hackers</a>, which documented the development of the hacker culture predating Stallman. I can see the connection of ideas from the hacker ethic to free software to the open source philosophy. My guess is that the software world is fortunate to have pioneers who advocated for freedom and openness from the beginning, whereas for academia which has a much longer history, credit protection has always been a bigger concern.</p> -<p>Also a month ago I attended a workshop called <a href="https://www.perimeterinstitute.ca/conferences/open-research-rethinking-scientific-collaboration">Open research: rethinking scientific collaboration</a>. That was the first time I met a group of people (mostly physicists) who also want open research to happen, and we had some stimulating discussions.</p> +<p>First, I read about Richard Stallman the founder of the free software movement, in <a href="http://shop.oreilly.com/product/9780596002879.do">his biography by Sam Williams</a> and his own collection of essays <a href="https://shop.fsf.org/books-docs/free-software-free-society-selected-essays-richard-m-stallman-3rd-edition"><em>Free software, free society</em></a>, from which I learned a bit more about the context and philosophy of free software and its relation to that of open source software. For anyone interested in open research, I highly recommend having a look at these two books. I am also reading Levy’s <a href="http://www.stevenlevy.com/index.php/books/hackers">Hackers</a>, which documented the development of the hacker culture predating Stallman. I can see the connection of ideas from the hacker ethic to the free software philosophy and to the open source philosophy. 
My guess is that the software world is fortunate to have pioneers who advocated for various kinds of freedom and openness from the beginning, whereas for academia which has a much longer history, credit protection has always been a bigger concern.</p> +<p>Also a month ago I attended a workshop called <a href="https://www.perimeterinstitute.ca/conferences/open-research-rethinking-scientific-collaboration">Open research: rethinking scientific collaboration</a>. That was the first time I met a group of people (mostly physicists) who also want open research to happen, and we had some stimulating discussions. Many thanks to the organisers at Perimeter Institute for organising the event, and special thanks to <a href="https://www.perimeterinstitute.ca/people/matteo-smerlak">Matteo Smerlak</a> and <a href="https://www.perimeterinstitute.ca/people/ashley-milsted">Ashley Milsted</a> for invitation and hosting.</p> <p>From both of these I feel like I should write an updated post on open research.</p> <h3 id="freedom-and-community">Freedom and community</h3> <p>Ideals matter. Stallman’s struggles stemmed from the frustration of denied request of source code (a frustration I shared in academia except source code is replaced by maths knowledge), and revolved around two things that underlie the free software movement: freedom and community. That is, the freedom to use, modify and share a work, and by sharing, to help the community.</p> diff --git a/site/index.html b/site/index.html index 8f85f67..a167ab1 100644 --- a/site/index.html +++ b/site/index.html @@ -19,10 +19,10 @@ <div class="main"> <div class="bodyitem"> - <p>Yuchen is a postdoctoral researcher in the <a href="https://www.math.kth.se/RMSMA/">KTH RMSMA group</a>. 
Before KTH he did a PhD in the <a href="https://warwick.ac.uk/fac/sci/masdoc">MASDOC program at Warwick</a>, and spent two years in a postdoc position in <a href="http://cmsa.fas.harvard.edu">CMSA at Harvard</a>.</p> + <p>Yuchen is a post-doctoral researcher in mathematics at the <a href="https://www.math.kth.se/RMSMA/">KTH RMSMA group</a>. Before KTH he did a PhD at the <a href="https://warwick.ac.uk/fac/sci/masdoc">MASDOC program at Warwick</a>, and spent two years in a postdoc position at <a href="http://cmsa.fas.harvard.edu">CMSA at Harvard</a>.</p> <p>He is interested in machine learning and functional programming.</p> <p>He is also interested in the idea of open research and open sourced his research in Robinson-Schensted algorithms as a <a href="https://toywiki.xyz">wiki</a>.</p> -<p>He can be reached at: hi@ypei.me | <a href="https://github.com/ycpei">Github</a> | <a href="https://www.linkedin.com/in/ycpei/">LinkedIn</a></p> +<p>He can be reached at: ypei@kth.se | hi@ypei.me | <a href="https://github.com/ycpei">Github</a> | <a href="https://www.linkedin.com/in/ycpei/">LinkedIn</a></p> <p>This website is made using a <a href="https://github.com/ycpei/ypei.me/blob/master/engine/engine.py">handmade static site generator</a>.</p> <p>Unless otherwise specified, all contents on this website are licensed under <a href="https://creativecommons.org/licenses/by-nd/4.0/">Creative Commons Attribution-NoDerivatives 4.0 International License</a>.</p> diff --git a/site/microblog-feed.xml b/site/microblog-feed.xml index 79d1dbc..1bc1634 100644 --- a/site/microblog-feed.xml +++ b/site/microblog-feed.xml @@ -2,7 +2,7 @@ <feed xmlns="http://www.w3.org/2005/Atom"> <title type="text">Yuchen Pei's Microblog</title> <id>https://ypei.me/microblog-feed.xml</id> - <updated>2018-04-06T00:00:00Z</updated> + <updated>2018-05-11T00:00:00Z</updated> <link href="https://ypei.me" /> <link href="https://ypei.me/microblog-feed.xml" rel="self" /> <author> @@ -10,6 +10,160 @@ </author> 
<generator>PyAtom</generator> <entry xml:base="https://ypei.me/microblog-feed.xml"> + <title type="text">2018-05-11</title> + <id>microblog.html</id> + <updated>2018-05-11T00:00:00Z</updated> + <link href="microblog.html" /> + <author> + <name>Yuchen Pei</name> + </author> + <content type="html"><h3 id="some-notes-on-rnn-fsm-fa-tm-and-utm">Some notes on RNN, FSM / FA, TM and UTM</h3> +<p>Related to a previous micropost.</p> +<p><a href="http://www.cs.toronto.edu/~rgrosse/csc321/lec9.pdf">The slides from Toronto</a> are a nice introduction to RNN (recurrent neural network) from a computational point of view. They state that an RNN can simulate any FSM (finite state machine, a.k.a. finite automaton, abbr. FA), with a toy example computing the parity of a binary string.</p> +<p><a href="http://www.deeplearningbook.org/contents/rnn.html">Goodfellow et al.’s book</a> (see pages 372 and 374) goes one step further, stating that RNN with a hidden-to-hidden layer can simulate Turing machines, and not only that, but also the <em>universal</em> Turing machine, abbr.
UTM (the book referenced <a href="https://www.sciencedirect.com/science/article/pii/S0022000085710136">Siegelmann-Sontag</a>), a property not shared by the weaker network where the hidden-to-hidden layer is replaced by an output-to-hidden layer (page 376).</p> +<p>By the way, the RNN with a hidden-to-hidden layer has the same architecture as the so-called linear dynamical system mentioned in <a href="https://www.coursera.org/learn/neural-networks/lecture/Fpa7y/modeling-sequences-a-brief-overview">Hinton’s video</a>.</p> +<p>From what I have learned, the universality of RNNs and that of feedforward networks are therefore due to different arguments, the former coming from Turing machines and the latter from an analytical view of approximation by step functions.</p> +</content> + </entry> + <entry xml:base="https://ypei.me/microblog-feed.xml"> + <title type="text">2018-05-10</title> + <id>microblog.html</id> + <updated>2018-05-10T00:00:00Z</updated> + <link href="microblog.html" /> + <author> + <name>Yuchen Pei</name> + </author> + <content type="html"><h3 id="writing-readable-mathematics-like-writing-an-operating-system">Writing readable mathematics like writing an operating system</h3> +<p>One way to write readable mathematics is to decouple concepts. One idea is the following template. First write a toy example with all the important components present in this example, then analyse each component individually and elaborate how (perhaps more complex) variations of the component can extend the toy example and induce more complex or powerful versions of the toy example.
Through such incremental development, one should be able to arrive at any result in cutting-edge research after a pleasant journey.</p> +<p>It’s a bit like the UNIX philosophy, where you have a basic system of modules like IO, memory management, graphics etc, and modify / improve each module individually (H/t <a href="http://nand2tetris.org/">NAND2Tetris</a>).</p> +<p>The book <a href="http://neuralnetworksanddeeplearning.com/">Neural networks and deep learning</a> by Michael Nielsen is an example of such an approach. It begins the journey with a very simple neural net with one hidden layer, no regularisation, and sigmoid activations. It then analyses each component including cost functions, the back propagation algorithm, the activation functions, regularisation and the overall architecture (from fully connected to CNN) individually and improves the toy example incrementally. Over the course, the accuracy on the MNIST example grows from 95.42% to 99.67%.</p> +</content> + </entry> + <entry xml:base="https://ypei.me/microblog-feed.xml"> + <title type="text">2018-05-09</title> + <id>microblog.html</id> + <updated>2018-05-09T00:00:00Z</updated> + <link href="microblog.html" /> + <author> + <name>Yuchen Pei</name> + </author> + <content type="html"><blockquote> +<p>What makes the rectified linear activation function better than the sigmoid or tanh functions? At present, we have a poor understanding of the answer to this question. Indeed, rectified linear units have only begun to be widely used in the past few years. The reason for that recent adoption is empirical: a few people tried rectified linear units, often on the basis of hunches or heuristic arguments. They got good results classifying benchmark data sets, and the practice has spread. In an ideal world we’d have a theory telling us which activation function to pick for which application. But at present we’re a long way from such a world.
I should not be at all surprised if further major improvements can be obtained by an even better choice of activation function. And I also expect that in coming decades a powerful theory of activation functions will be developed. Today, we still have to rely on poorly understood rules of thumb and experience.</p> +</blockquote> +<p>Michael Nielsen, <a href="http://neuralnetworksanddeeplearning.com/chap6.html#convolutional_neural_networks_in_practice">Neural networks and deep learning</a></p> +</content> + </entry> + <entry xml:base="https://ypei.me/microblog-feed.xml"> + <title type="text">2018-05-09</title> + <id>microblog.html</id> + <updated>2018-05-09T00:00:00Z</updated> + <link href="microblog.html" /> + <author> + <name>Yuchen Pei</name> + </author> + <content type="html"><blockquote> +<p>One way RNNs are currently being used is to connect neural networks more closely to traditional ways of thinking about algorithms, ways of thinking based on concepts such as Turing machines and (conventional) programming languages. <a href="https://arxiv.org/abs/1410.4615">A 2014 paper</a> developed an RNN which could take as input a character-by-character description of a (very, very simple!) Python program, and use that description to predict the output. Informally, the network is learning to “understand” certain Python programs. <a href="https://arxiv.org/abs/1410.5401">A second paper, also from 2014</a>, used RNNs as a starting point to develop what they called a neural Turing machine (NTM). This is a universal computer whose entire structure can be trained using gradient descent. They trained their NTM to infer algorithms for several simple problems, such as sorting and copying.</p> +<p>As it stands, these are extremely simple toy models. Learning to execute the Python program <code>print(398345+42598)</code> doesn’t make a network into a full-fledged Python interpreter! It’s not clear how much further it will be possible to push the ideas.
Still, the results are intriguing. Historically, neural networks have done well at pattern recognition problems where conventional algorithmic approaches have trouble. Vice versa, conventional algorithmic approaches are good at solving problems that neural nets aren’t so good at. No-one today implements a web server or a database program using a neural network! It’d be great to develop unified models that integrate the strengths of both neural networks and more traditional approaches to algorithms. RNNs and ideas inspired by RNNs may help us do that.</p> +</blockquote> +<p>Michael Nielsen, <a href="http://neuralnetworksanddeeplearning.com/chap6.html#other_approaches_to_deep_neural_nets">Neural networks and deep learning</a></p> +</content> + </entry> + <entry xml:base="https://ypei.me/microblog-feed.xml"> + <title type="text">2018-05-08</title> + <id>microblog.html</id> + <updated>2018-05-08T00:00:00Z</updated> + <link href="microblog.html" /> + <author> + <name>Yuchen Pei</name> + </author> + <content type="html"><p>Primer Science is a tool by a startup called Primer that uses NLP to summarize contents (but not single papers, yet) on arxiv. A developer of this tool predicts in <a href="https://twimlai.com/twiml-talk-136-taming-arxiv-w-natural-language-processing-with-john-bohannon/#">an interview</a> that progress on AI’s ability to extract meanings from AI research papers will be the biggest accelerant on AI research.</p> +</content> + </entry> + <entry xml:base="https://ypei.me/microblog-feed.xml"> + <title type="text">2018-05-08</title> + <id>microblog.html</id> + <updated>2018-05-08T00:00:00Z</updated> + <link href="microblog.html" /> + <author> + <name>Yuchen Pei</name> + </author> + <content type="html"><blockquote> +<p>no-one has yet developed an entirely convincing theoretical explanation for why regularization helps networks generalize. 
Indeed, researchers continue to write papers where they try different approaches to regularization, compare them to see which works better, and attempt to understand why different approaches work better or worse. And so you can view regularization as something of a kludge. While it often helps, we don’t have an entirely satisfactory systematic understanding of what’s going on, merely incomplete heuristics and rules of thumb.</p> +<p>There’s a deeper set of issues here, issues which go to the heart of science. It’s the question of how we generalize. Regularization may give us a computational magic wand that helps our networks generalize better, but it doesn’t give us a principled understanding of how generalization works, nor of what the best approach is.</p> +</blockquote> +<p>Michael Nielsen, <a href="http://neuralnetworksanddeeplearning.com/chap3.html#why_does_regularization_help_reduce_overfitting">Neural networks and deep learning</a></p> +</content> + </entry> + <entry xml:base="https://ypei.me/microblog-feed.xml"> + <title type="text">2018-05-08</title> + <id>microblog.html</id> + <updated>2018-05-08T00:00:00Z</updated> + <link href="microblog.html" /> + <author> + <name>Yuchen Pei</name> + </author> + <content type="html"><p>Computerphile has some brilliant educational videos on computer science, like <a href="https://www.youtube.com/watch?v=ciNHn38EyRc">a demo of SQL injection</a>, <a href="https://www.youtube.com/watch?v=eis11j_iGMs">a toy example of the lambda calculus</a>, and <a href="https://www.youtube.com/watch?v=9T8A89jgeTI">explaining the Y combinator</a>.</p> +</content> + </entry> + <entry xml:base="https://ypei.me/microblog-feed.xml"> + <title type="text">2018-05-07</title> + <id>microblog.html</id> + <updated>2018-05-07T00:00:00Z</updated> + <link href="microblog.html" /> + <author> + <name>Yuchen Pei</name> + </author> + <content type="html"><h3 id="learning-via-knowledge-graph-and-reddit-journal-clubs">Learning via knowledge graph and reddit 
journal clubs</h3> +<p>It is a natural idea to look for ways to learn things like going through a skill tree in a computer RPG.</p> +<p>For example I made a <a href="https://ypei.me/posts/2015-04-02-juggling-skill-tree.html">DAG for juggling</a>.</p> +<p>Websites like <a href="https://knowen.org">Knowen</a> and <a href="https://metacademy.org">Metacademy</a> explore this idea with added flavour of open collaboration.</p> +<p>The design of Metacademy looks quite promising. It also has a nice tagline: “your package manager for knowledge”.</p> +<p>There are so so many tools to assist learning / research / knowledge sharing today, and we should keep experimenting, in the hope that eventually one of them will scale.</p> +<p>On another note, I often complain about the lack of a place to discuss math research online, but today I found on Reddit some journal clubs on machine learning: <a href="https://www.reddit.com/r/MachineLearning/comments/8aluhs/d_machine_learning_wayr_what_are_you_reading_week/">1</a>, <a href="https://www.reddit.com/r/MachineLearning/comments/8elmd8/d_anyone_having_trouble_reading_a_particular/">2</a>. If only we had this for maths. On the other hand r/math does have some interesting recurring threads as well: <a href="https://www.reddit.com/r/math/wiki/everythingaboutx">Everything about X</a> and <a href="https://www.reddit.com/r/math/search?q=what+are+you+working+on?+author:automoderator+&amp;sort=new&amp;restrict_sr=on&amp;t=all">What Are You Working On?</a>. 
Hopefully these threads can last for years to come.</p> +</content> + </entry> + <entry xml:base="https://ypei.me/microblog-feed.xml"> + <title type="text">2018-05-02</title> + <id>microblog.html</id> + <updated>2018-05-02T00:00:00Z</updated> + <link href="microblog.html" /> + <author> + <name>Yuchen Pei</name> + </author> + <content type="html"><h3 id="pastebin-for-the-win">Pastebin for the win</h3> +<p>The lack of maths rendering in major online communication platforms like instant messaging, email or Github has been a minor obsession of mine for quite a while, as I saw it as a big factor preventing people from talking more maths online. But today I realised this is totally a non-issue. Just do what people on IRC have been doing since the inception of the universe: use a (latex) pastebin.</p> +</content> + </entry> + <entry xml:base="https://ypei.me/microblog-feed.xml"> + <title type="text">2018-05-01</title> + <id>microblog.html</id> + <updated>2018-05-01T00:00:00Z</updated> + <link href="microblog.html" /> + <author> + <name>Yuchen Pei</name> + </author> + <content type="html"><blockquote> +<p>Neural networks are one of the most beautiful programming paradigms ever invented. In the conventional approach to programming, we tell the computer what to do, breaking big problems up into many small, precisely defined tasks that the computer can easily perform. By contrast, in a neural network we don’t tell the computer how to solve our problem. 
Instead, it learns from observational data, figuring out its own solution to the problem at hand.</p> +</blockquote> +<p>Michael Nielsen - <a href="http://neuralnetworksanddeeplearning.com/about.html">What this book (Neural Networks and Deep Learning) is about</a></p> +<p>Unrelated to the quote, note that Nielsen’s book is licensed under <a href="https://creativecommons.org/licenses/by-nc/3.0/deed.en_GB">CC BY-NC</a>, so one can build on it and redistribute non-commercially.</p> +</content> + </entry> + <entry xml:base="https://ypei.me/microblog-feed.xml"> + <title type="text">2018-04-30</title> + <id>microblog.html</id> + <updated>2018-04-30T00:00:00Z</updated> + <link href="microblog.html" /> + <author> + <name>Yuchen Pei</name> + </author> + <content type="html"><blockquote> +<p>But, users have learned to accommodate to Google not the other way around. We know what kinds of things we can type into Google and what we can’t and we keep our searches to things that Google is likely to help with. We know we are looking for texts and not answers to start a conversation with an entity that knows what we really need to talk about. People learn from conversation and Google can’t have one. 
It can pretend to have one using Siri but really those conversations tend to get tiresome when you are past asking about where to eat.</p> +</blockquote> +<p>Roger Schank - <a href="http://www.rogerschank.com/fraudulent-claims-made-by-IBM-about-Watson-and-AI">Fraudulent claims made by IBM about Watson and AI</a></p> +</content> + </entry> + <entry xml:base="https://ypei.me/microblog-feed.xml"> <title type="text">2018-04-06</title> <id>microblog.html</id> <updated>2018-04-06T00:00:00Z</updated> diff --git a/site/microblog.html b/site/microblog.html index 3297a40..8d3ba5a 100644 --- a/site/microblog.html +++ b/site/microblog.html @@ -19,7 +19,95 @@ <div class="main"> <div class="bodyitem"> - <p>2018-04-06</p> + <span id=rnn-fsm><p><a href="#rnn-fsm">2018-05-11</a></p></span> + <h3 id="some-notes-on-rnn-fsm-fa-tm-and-utm">Some notes on RNN, FSM / FA, TM and UTM</h3> +<p>Related to a previous micropost.</p> +<p><a href="http://www.cs.toronto.edu/~rgrosse/csc321/lec9.pdf">The slides from Toronto</a> are a nice introduction to RNN (recurrent neural network) from a computational point of view. They state that an RNN can simulate any FSM (finite state machine, a.k.a. finite automaton, abbr. FA), with a toy example computing the parity of a binary string.</p> +<p><a href="http://www.deeplearningbook.org/contents/rnn.html">Goodfellow et al.’s book</a> (see pages 372 and 374) goes one step further, stating that RNN with a hidden-to-hidden layer can simulate Turing machines, and not only that, but also the <em>universal</em> Turing machine, abbr.
UTM (the book referenced <a href="https://www.sciencedirect.com/science/article/pii/S0022000085710136">Siegelmann-Sontag</a>), a property not shared by the weaker network where the hidden-to-hidden layer is replaced by an output-to-hidden layer (page 376).</p> +<p>By the way, the RNN with a hidden-to-hidden layer has the same architecture as the so-called linear dynamical system mentioned in <a href="https://www.coursera.org/learn/neural-networks/lecture/Fpa7y/modeling-sequences-a-brief-overview">Hinton’s video</a>.</p> +<p>From what I have learned, the universality of RNNs and that of feedforward networks are therefore due to different arguments, the former coming from Turing machines and the latter from an analytical view of approximation by step functions.</p> + +</div> +<div class="bodyitem"> + <span id=math-writing-decoupling><p><a href="#math-writing-decoupling">2018-05-10</a></p></span> + <h3 id="writing-readable-mathematics-like-writing-an-operating-system">Writing readable mathematics like writing an operating system</h3> +<p>One way to write readable mathematics is to decouple concepts. One idea is the following template. First write a toy example with all the important components present in this example, then analyse each component individually and elaborate how (perhaps more complex) variations of the component can extend the toy example and induce more complex or powerful versions of the toy example. Through such incremental development, one should be able to arrive at any result in cutting-edge research after a pleasant journey.</p> +<p>It’s a bit like the UNIX philosophy, where you have a basic system of modules like IO, memory management, graphics etc, and modify / improve each module individually (H/t <a href="http://nand2tetris.org/">NAND2Tetris</a>).</p> +<p>The book <a href="http://neuralnetworksanddeeplearning.com/">Neural networks and deep learning</a> by Michael Nielsen is an example of such an approach.
It begins the journey with a very simple neural net with one hidden layer, no regularisation, and sigmoid activations. It then analyses each component including cost functions, the back propagation algorithm, the activation functions, regularisation and the overall architecture (from fully connected to CNN) individually and improves the toy example incrementally. Over the course, the accuracy on the MNIST example grows from 95.42% to 99.67%.</p> + +</div> +<div class="bodyitem"> + <span id=neural-nets-activation><p><a href="#neural-nets-activation">2018-05-09</a></p></span> + <blockquote> +<p>What makes the rectified linear activation function better than the sigmoid or tanh functions? At present, we have a poor understanding of the answer to this question. Indeed, rectified linear units have only begun to be widely used in the past few years. The reason for that recent adoption is empirical: a few people tried rectified linear units, often on the basis of hunches or heuristic arguments. They got good results classifying benchmark data sets, and the practice has spread. In an ideal world we’d have a theory telling us which activation function to pick for which application. But at present we’re a long way from such a world. I should not be at all surprised if further major improvements can be obtained by an even better choice of activation function. And I also expect that in coming decades a powerful theory of activation functions will be developed.
Today, we still have to rely on poorly understood rules of thumb and experience.</p> +</blockquote> +<p>Michael Nielsen, <a href="http://neuralnetworksanddeeplearning.com/chap6.html#convolutional_neural_networks_in_practice">Neural networks and deep learning</a></p> + +</div> +<div class="bodyitem"> + <span id=neural-turing-machine><p><a href="#neural-turing-machine">2018-05-09</a></p></span> + <blockquote> +<p>One way RNNs are currently being used is to connect neural networks more closely to traditional ways of thinking about algorithms, ways of thinking based on concepts such as Turing machines and (conventional) programming languages. <a href="https://arxiv.org/abs/1410.4615">A 2014 paper</a> developed an RNN which could take as input a character-by-character description of a (very, very simple!) Python program, and use that description to predict the output. Informally, the network is learning to “understand” certain Python programs. <a href="https://arxiv.org/abs/1410.5401">A second paper, also from 2014</a>, used RNNs as a starting point to develop what they called a neural Turing machine (NTM). This is a universal computer whose entire structure can be trained using gradient descent. They trained their NTM to infer algorithms for several simple problems, such as sorting and copying.</p> +<p>As it stands, these are extremely simple toy models. Learning to execute the Python program <code>print(398345+42598)</code> doesn’t make a network into a full-fledged Python interpreter! It’s not clear how much further it will be possible to push the ideas. Still, the results are intriguing. Historically, neural networks have done well at pattern recognition problems where conventional algorithmic approaches have trouble. Vice versa, conventional algorithmic approaches are good at solving problems that neural nets aren’t so good at. No-one today implements a web server or a database program using a neural network!
It’d be great to develop unified models that integrate the strengths of both neural networks and more traditional approaches to algorithms. RNNs and ideas inspired by RNNs may help us do that.</p> +</blockquote> +<p>Michael Nielsen, <a href="http://neuralnetworksanddeeplearning.com/chap6.html#other_approaches_to_deep_neural_nets">Neural networks and deep learning</a></p> + +</div> +<div class="bodyitem"> + <span id=nlp-arxiv><p><a href="#nlp-arxiv">2018-05-08</a></p></span> + <p>Primer Science is a tool by a startup called Primer that uses NLP to summarise content on arxiv (though not single papers, yet). A developer of this tool predicts in <a href="https://twimlai.com/twiml-talk-136-taming-arxiv-w-natural-language-processing-with-john-bohannon/#">an interview</a> that progress on AI’s ability to extract meaning from AI research papers will be the biggest accelerant of AI research.</p> + +</div> +<div class="bodyitem"> + <span id=neural-nets-regularization><p><a href="#neural-nets-regularization">2018-05-08</a></p></span> + <blockquote> +<p>no-one has yet developed an entirely convincing theoretical explanation for why regularization helps networks generalize. Indeed, researchers continue to write papers where they try different approaches to regularization, compare them to see which works better, and attempt to understand why different approaches work better or worse. And so you can view regularization as something of a kludge. While it often helps, we don’t have an entirely satisfactory systematic understanding of what’s going on, merely incomplete heuristics and rules of thumb.</p> +<p>There’s a deeper set of issues here, issues which go to the heart of science. It’s the question of how we generalize.
Regularization may give us a computational magic wand that helps our networks generalize better, but it doesn’t give us a principled understanding of how generalization works, nor of what the best approach is.</p> +</blockquote> +<p>Michael Nielsen, <a href="http://neuralnetworksanddeeplearning.com/chap3.html#why_does_regularization_help_reduce_overfitting">Neural networks and deep learning</a></p> + +</div> +<div class="bodyitem"> + <span id=sql-injection-video><p><a href="#sql-injection-video">2018-05-08</a></p></span> + <p>Computerphile has some brilliant educational videos on computer science, like <a href="https://www.youtube.com/watch?v=ciNHn38EyRc">a demo of SQL injection</a>, <a href="https://www.youtube.com/watch?v=eis11j_iGMs">a toy example of the lambda calculus</a>, and <a href="https://www.youtube.com/watch?v=9T8A89jgeTI">explaining the Y combinator</a>.</p> + +</div> +<div class="bodyitem"> + <span id=learning-knowledge-graph-reddit-journal-club><p><a href="#learning-knowledge-graph-reddit-journal-club">2018-05-07</a></p></span> + <h3 id="learning-via-knowledge-graph-and-reddit-journal-clubs">Learning via knowledge graph and reddit journal clubs</h3> +<p>It is a natural idea to look for ways to learn things the way one progresses through a skill tree in a computer RPG.</p> +<p>For example, I made a <a href="https://ypei.me/posts/2015-04-02-juggling-skill-tree.html">DAG for juggling</a>.</p> +<p>Websites like <a href="https://knowen.org">Knowen</a> and <a href="https://metacademy.org">Metacademy</a> explore this idea with the added flavour of open collaboration.</p> +<p>The design of Metacademy looks quite promising.
It also has a nice tagline: “your package manager for knowledge”.</p> +<p>There are so many tools to assist learning / research / knowledge sharing today, and we should keep experimenting, in the hope that eventually one of them will scale.</p> +<p>On another note, I often complain about the lack of a place to discuss math research online, but today I found on Reddit some journal clubs on machine learning: <a href="https://www.reddit.com/r/MachineLearning/comments/8aluhs/d_machine_learning_wayr_what_are_you_reading_week/">1</a>, <a href="https://www.reddit.com/r/MachineLearning/comments/8elmd8/d_anyone_having_trouble_reading_a_particular/">2</a>. If only we had this for maths. On the other hand, r/math does have some interesting recurring threads as well: <a href="https://www.reddit.com/r/math/wiki/everythingaboutx">Everything about X</a> and <a href="https://www.reddit.com/r/math/search?q=what+are+you+working+on?+author:automoderator+&sort=new&restrict_sr=on&t=all">What Are You Working On?</a>. Hopefully these threads can last for years to come.</p> + +</div> +<div class="bodyitem"> + <span id=simple-solution-lack-of-math-rendering><p><a href="#simple-solution-lack-of-math-rendering">2018-05-02</a></p></span> + <h3 id="pastebin-for-the-win">Pastebin for the win</h3> +<p>The lack of maths rendering in major online communication platforms like instant messaging, email or Github has been a minor obsession of mine for quite a while, as I saw it as a big factor preventing people from talking more maths online. But today I realised this is totally a non-issue. Just do what people on IRC have been doing since the inception of the universe: use a (latex) pastebin.</p> + +</div> +<div class="bodyitem"> + <span id=neural-networks-programming-paradigm><p><a href="#neural-networks-programming-paradigm">2018-05-01</a></p></span> + <blockquote> +<p>Neural networks are one of the most beautiful programming paradigms ever invented.
In the conventional approach to programming, we tell the computer what to do, breaking big problems up into many small, precisely defined tasks that the computer can easily perform. By contrast, in a neural network we don’t tell the computer how to solve our problem. Instead, it learns from observational data, figuring out its own solution to the problem at hand.</p> +</blockquote> +<p>Michael Nielsen - <a href="http://neuralnetworksanddeeplearning.com/about.html">What this book (Neural Networks and Deep Learning) is about</a></p> +<p>Unrelated to the quote, note that Nielsen’s book is licensed under <a href="https://creativecommons.org/licenses/by-nc/3.0/deed.en_GB">CC BY-NC</a>, so one can build on it and redistribute non-commercially.</p> + +</div> +<div class="bodyitem"> + <span id=google-search-not-ai><p><a href="#google-search-not-ai">2018-04-30</a></p></span> + <blockquote> +<p>But, users have learned to accommodate to Google not the other way around. We know what kinds of things we can type into Google and what we can’t and we keep our searches to things that Google is likely to help with. We know we are looking for texts and not answers to start a conversation with an entity that knows what we really need to talk about. People learn from conversation and Google can’t have one. It can pretend to have one using Siri but really those conversations tend to get tiresome when you are past asking about where to eat.</p> +</blockquote> +<p>Roger Schank - <a href="http://www.rogerschank.com/fraudulent-claims-made-by-IBM-about-Watson-and-AI">Fraudulent claims made by IBM about Watson and AI</a></p> + +</div> +<div class="bodyitem"> + <span id=hacker-ethics><p><a href="#hacker-ethics">2018-04-06</a></p></span> <blockquote> <ul> <li>Access to computers—and anything that might teach you something about the way the world works—should be unlimited and total. 
Always yield to the Hands-On Imperative!</li> @@ -34,7 +122,7 @@ </div> <div class="bodyitem"> - <p>2018-03-23</p> + <span id=static-site-generator><p><a href="#static-site-generator">2018-03-23</a></p></span> <blockquote> <p>“Static site generators seem like music databases, in that everyone eventually writes their own crappy one that just barely scratches the itch they had (and I’m no exception).”</p> </blockquote> diff --git a/site/posts/2018-04-10-update-open-research.html b/site/posts/2018-04-10-update-open-research.html index c76557f..edfcb39 100644 --- a/site/posts/2018-04-10-update-open-research.html +++ b/site/posts/2018-04-10-update-open-research.html @@ -38,8 +38,8 @@ <p>Posted on 2018-04-29</p> <p>It has been 9 months since I last wrote about open (maths) research. Since then two things happened which prompted me to write an update.</p> <p>As always I discuss open research only in mathematics, not because I think it should not be applied to other disciplines, but simply because I do not have experience nor sufficient interests in non-mathematical subjects.</p> -<p>First, I read about Richard Stallman the founder of the free software movement, in <a href="http://shop.oreilly.com/product/9780596002879.do">his biography by Sam Williams</a> and his own collection of essays <a href="https://shop.fsf.org/books-docs/free-software-free-society-selected-essays-richard-m-stallman-3rd-edition"><em>Free software, free society</em></a>, from which I learned a bit more about the context and philosophy of free software and open source software. For anyone interested in open research, I highly recommend having a look at these two books. I am also reading Levy’s <a href="http://www.stevenlevy.com/index.php/books/hackers">Hackers</a>, which documented the development of the hacker culture predating Stallman. I can see the connection of ideas from the hacker ethic to free software to the open source philosophy. 
My guess is that the software world is fortunate to have pioneers who advocated for freedom and openness from the beginning, whereas for academia which has a much longer history, credit protection has always been a bigger concern.</p> -<p>Also a month ago I attended a workshop called <a href="https://www.perimeterinstitute.ca/conferences/open-research-rethinking-scientific-collaboration">Open research: rethinking scientific collaboration</a>. That was the first time I met a group of people (mostly physicists) who also want open research to happen, and we had some stimulating discussions.</p> +<p>First, I read about Richard Stallman, the founder of the free software movement, in <a href="http://shop.oreilly.com/product/9780596002879.do">his biography by Sam Williams</a> and his own collection of essays <a href="https://shop.fsf.org/books-docs/free-software-free-society-selected-essays-richard-m-stallman-3rd-edition"><em>Free software, free society</em></a>, from which I learned a bit more about the context and philosophy of free software and its relation to that of open source software. For anyone interested in open research, I highly recommend having a look at these two books. I am also reading Levy’s <a href="http://www.stevenlevy.com/index.php/books/hackers">Hackers</a>, which documented the development of the hacker culture predating Stallman. I can see the connection of ideas from the hacker ethic to the free software philosophy and to the open source philosophy. My guess is that the software world is fortunate to have pioneers who advocated for various kinds of freedom and openness from the beginning, whereas for academia, which has a much longer history, credit protection has always been a bigger concern.</p> +<p>Also a month ago I attended a workshop called <a href="https://www.perimeterinstitute.ca/conferences/open-research-rethinking-scientific-collaboration">Open research: rethinking scientific collaboration</a>.
That was the first time I met a group of people (mostly physicists) who also want open research to happen, and we had some stimulating discussions. Many thanks to the organisers at Perimeter Institute, and special thanks to <a href="https://www.perimeterinstitute.ca/people/matteo-smerlak">Matteo Smerlak</a> and <a href="https://www.perimeterinstitute.ca/people/ashley-milsted">Ashley Milsted</a> for the invitation and hosting.</p> <p>From both of these I feel like I should write an updated post on open research.</p> <h3 id="freedom-and-community">Freedom and community</h3> <p>Ideals matter. Stallman’s struggles stemmed from the frustration of a denied request for source code (a frustration I shared in academia, except with source code replaced by maths knowledge), and revolved around two things that underlie the free software movement: freedom and community. That is, the freedom to use, modify and share a work, and by sharing, to help the community.</p> |