author | Yuchen Pei <me@ypei.me> | 2021-06-18 12:58:44 +1000 |
---|---|---|
committer | Yuchen Pei <me@ypei.me> | 2021-06-18 12:58:44 +1000 |
commit | 147a19e84a743f1379f05bf2f444143b4afd7bd6 (patch) | |
tree | 3127395250cb958f06a98b86f73e77658150b43c /pages | |
parent | 4fa26fec8b7e978955e5630d3f820ba9c53be72c (diff) |
Updated.
Diffstat (limited to 'pages')
-rw-r--r-- | pages/all-microposts.org | 773 | ||||
-rw-r--r-- | pages/blog.org | 20 | ||||
-rw-r--r-- | pages/microblog.org | 683 |
3 files changed, 1476 insertions, 0 deletions
diff --git a/pages/all-microposts.org b/pages/all-microposts.org new file mode 100644 index 0000000..92896bc --- /dev/null +++ b/pages/all-microposts.org @@ -0,0 +1,773 @@ +#+title: Yuchen's Microblog + +*** 2020-08-02: ia-lawsuit + :PROPERTIES: + :CUSTOM_ID: ia-lawsuit + :END: +The four big publishers Hachette, HarperCollins, Wiley, and Penguin +Random House are still pursuing Internet Archive. + +#+begin_quote + [Their] lawsuit does not stop at seeking to end the practice of + Controlled Digital Lending. These publishers call for the destruction + of the 1.5 million digital books that Internet Archive makes available + to our patrons. This form of digital book burning is unprecedented and + unfairly disadvantages people with print disabilities. For the blind, + ebooks are a lifeline, yet less than one in ten exists in accessible + formats. Since 2010, Internet Archive has made our lending library + available to the blind and print disabled community, in addition to + sighted users. If the publishers are successful with their lawsuit, + more than a million of those books would be deleted from the + Internet's digital shelves forever. +#+end_quote + +[[https://blog.archive.org/2020/07/29/internet-archive-responds-to-publishers-lawsuit/][Libraries +lend books, and must continue to lend books: Internet Archive responds +to publishers' lawsuit]] +*** 2020-08-02: fsf-membership + :PROPERTIES: + :CUSTOM_ID: fsf-membership + :END: +I am a proud associate member of Free Software Foundation. For me the +philosophy of Free Software is about ensuring the enrichment of a +digital commons, so that knowledge and information are not concentrated +in the hands of selected privileged people and locked up as +"intellectual property". The genius of copyleft licenses like GNU (A)GPL +ensures software released for the public remains public. Open source +does not care about that. 
+ +If you also care about the public good, the hacker ethic, or the spirit +of the web, please take a moment to consider joining FSF as an associate +member. It comes with [[https://www.fsf.org/associate/benefits][numerous +perks and benefits]]. +*** 2020-06-21: how-can-you-help-ia + :PROPERTIES: + :CUSTOM_ID: how-can-you-help-ia + :END: +[[https://blog.archive.org/2020/06/14/how-can-you-help-the-internet-archive/][How +can you help the Internet Archive?]] Use it. It's more than the Wayback +Machine. And get involved. +*** 2020-06-12: open-library + :PROPERTIES: + :CUSTOM_ID: open-library + :END: +Open Library was cofounded by Aaron Swartz. As part of the Internet +Archive, it has done good work to spread knowledge. However it is +currently +[[https://arstechnica.com/tech-policy/2020/06/internet-archive-ends-emergency-library-early-to-appease-publishers/][being +sued by four major publishers]] for the +[[https://archive.org/details/nationalemergencylibrary][National +Emergency Library]]. IA decided to +[[https://blog.archive.org/2020/06/10/temporary-national-emergency-library-to-close-2-weeks-early-returning-to-traditional-controlled-digital-lending/][close +the NEL two weeks earlier than planned]], but the lawsuit is not over, +which in the worst case scenario has the danger of resulting in +Controlled Digital Lending being considered illegal and (less likely) +bankruptcy of the Internet Archive. If this happens it will be a big +setback for the free-culture movement. +*** 2020-04-15: sanders-suspend-campaign + :PROPERTIES: + :CUSTOM_ID: sanders-suspend-campaign + :END: +Suspending the campaign is different from dropping out of the race. +Bernie Sanders remains on the ballot, and indeed in his campaign +suspension speech he encouraged people to continue voting for him in the +Democratic primaries to push for changes in the convention. 
+*** 2019-09-30: defense-stallman + :PROPERTIES: + :CUSTOM_ID: defense-stallman + :END: +Someone wrote a bold article titled +[[https://geoff.greer.fm/2019/09/30/in-defense-of-richard-stallman/]["In +Defense of Richard Stallman"]]. Kudos to him. + +Also, an interesting read: +[[https://cfenollosa.com/blog/famous-computer-public-figure-suffers-the-consequences-for-asshole-ish-behavior.html][Famous +public figure in tech suffers the consequences for asshole-ish +behavior]]. +*** 2019-09-29: stallman-resign + :PROPERTIES: + :CUSTOM_ID: stallman-resign + :END: +Last week Richard Stallman resigned from FSF. It is a great loss for the +free software movement. + +The apparent cause of his resignation and the events that triggered it +reflect some alarming trends of the zeitgeist. Here is a detailed review +of what happened: [[https://sterling-archermedes.github.io/][Low grade +"journalists" and internet mob attack RMS with lies. In-depth review.]]. +Some interesting articles on this are: +[[https://jackbaruth.com/?p=16779][Weekly Roundup: The Passion Of Saint +iGNUcius Edition]], +[[http://techrights.org/2019/09/17/rms-witch-hunt/][Why I Once Called +for Richard Stallman to Step Down]]. + +Dishonest and misleading media pieces involved in this incident include +[[https://www.thedailybeast.com/famed-mit-computer-scientist-richard-stallman-defends-epstein-victims-were-entirely-willing][The +Daily Beast]], +[[https://www.vice.com/en_us/article/9ke3ke/famed-computer-scientist-richard-stallman-described-epstein-victims-as-entirely-willing][Vice]], +[[https://techcrunch.com/2019/09/16/computer-scientist-richard-stallman-who-defended-jeffrey-epstein-resigns-from-mit-csail-and-the-free-software-foundation/][Tech +Crunch]], +[[https://www.wired.com/story/richard-stallmans-exit-heralds-a-new-era-in-tech/][Wired]]. +*** 2019-03-16: decss-haiku + :PROPERTIES: + :CUSTOM_ID: decss-haiku + :END: + +#+begin_quote + #+begin_example + Muse! 
When we learned to + count, little did we know all + the things we could do + + some day by shuffling + those numbers: Pythagoras + said "All is number" + + long before he saw + computers and their effects, + or what they could do + + by computation, + naive and mechanical + fast arithmetic. + + It changed the world, it + changed our consciousness and lives + to have such fast math + + available to + us and anyone who cared + to learn programming. + + Now help me, Muse, for + I wish to tell a piece of + controversial math, + + for which the lawyers + of DVD CCA + don't forbear to sue: + + that they alone should + know or have the right to teach + these skills and these rules. + + (Do they understand + the content, or is it just + the effects they see?) + + And all mathematics + is full of stories (just read + Eric Temple Bell); + + and CSS is + no exception to this rule. + Sing, Muse, decryption + + once secret, as all + knowledge, once unknown: how to + decrypt DVDs. + #+end_example +#+end_quote + +Seth Schoen, [[https://en.wikipedia.org/wiki/DeCSS_haiku][DeCSS haiku]] +*** 2019-01-27: learning-undecidable + :PROPERTIES: + :CUSTOM_ID: learning-undecidable + :END: +My take on the +[[https://www.nature.com/articles/s42256-018-0002-3][Nature paper +/Learning can be undecidable/]]: + +Fantastic article, very clearly written. + +So it reduces a kind of learnability called estimating the maximum +(EMX) to the cardinality of real numbers, which is undecidable. + +When it comes to the relation between EMX and the rest of machine +learning framework, the article mentions that EMX belongs to "extensions +of PAC learnability include Vapnik's statistical learning setting and +the equivalent general learning setting by Shalev-Shwartz and +colleagues" (I have no idea what these two things are), but it does not +say whether EMX is representative of or reduces to common learning +tasks. So it is not clear whether its undecidability applies to ML at +large. 
+ +Another condition to the main theorem is the union bounded closure +assumption. It seems a reasonable property of a family of sets, but then +again I wonder how that translates to learning. + +The article says "By now, we know of quite a few independence [from +mathematical axioms] results, mostly for set theoretic questions like +the continuum hypothesis, but also for results in algebra, analysis, +infinite combinatorics and more. Machine learning, so far, has escaped +this fate." but the description of the EMX learnability makes it more +like a classical mathematical / theoretical computer science problem +rather than machine learning. + +An insightful conclusion: "How come learnability can neither be proved +nor refuted? A closer look reveals that the source of the problem is in +defining learnability as the existence of a learning function rather +than the existence of a learning algorithm. In contrast with the +existence of algorithms, the existence of functions over infinite +domains is a (logically) subtle issue." + +In relation to practical problems, it uses an example of ad targeting. +However, A lot is lost in translation from the main theorem to this ad +example. + +The EMX problem states: given a domain X, a distribution P over X which +is unknown, some samples from P, and a family of subsets of X called F, +find A in F that approximately maximises P(A). + +The undecidability rests on X being the continuous [0, 1] interval, and +from the insight, we know the problem comes from the cardinality of +subsets of the [0, 1] interval, which is "logically subtle". + +In the ad problem, the domain X is all potential visitors, which is +finite because there are finite number of people in the world. In this +case P is a categorical distribution over the 1..n where n is the +population of the world. 
One can have a good estimate of the parameters +of a categorical distribution by asking for a sufficiently large number of +samples and computing the empirical distribution. Let's call the +estimated distribution Q. One can then choose from F (also finite) the +set that maximises Q(A), which will be a solution to EMX. + +In other words, the theorem states: EMX is undecidable because not all +EMX instances are decidable, because there are some nasty ones due to +infinities. That does not mean no EMX instance is decidable. And I think +the ad instance is decidable. Is there a learning task that actually +corresponds to an undecidable EMX instance? I don't know, but I will not +believe the result of this paper is useful until I see one. + +h/t Reynaldo Boulogne +*** 2018-12-11: gavin-belson + :PROPERTIES: + :CUSTOM_ID: gavin-belson + :END: + +#+begin_quote + I don't know about you people, but I don't want to live in a world + where someone else makes the world a better place better than we do. +#+end_quote + +Gavin Belson, Silicon Valley S2E1. + +I came across this quote in +[[https://slate.com/business/2018/12/facebook-emails-lawsuit-embarrassing-mark-zuckerberg.html][a +Slate post about Facebook]] +*** 2018-10-05: margins + :PROPERTIES: + :CUSTOM_ID: margins + :END: +With Fermat's Library's new tool +[[https://fermatslibrary.com/margins][margins]], you can host your own +journal club. +*** 2018-09-18: rnn-turing + :PROPERTIES: + :CUSTOM_ID: rnn-turing + :END: +Just some non-rigorous guess / thought: Feedforward networks are like +combinatorial logic, and recurrent networks are like sequential logic +(e.g. data flip-flop is like the feedback connection in RNN). Since NAND ++ combinatorial logic + sequential logic = von Neumann machine which is +an approximation of the Turing machine, it is not surprising that RNN +(with feedforward networks) is Turing complete (assuming that neural +networks can learn the NAND gate). 
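Editorial aside, not part of the committed file: the NAND assumption in the rnn-turing note can be checked by hand. The sketch below assumes a single threshold neuron trained with the classic perceptron rule; NAND is linearly separable, so the rule settles on working weights, and XOR is then wired from NAND gates alone to illustrate the "NAND + combinatorial logic" step. The names =neuron=, =train_nand=, =nand= and =xor= are made up for illustration, and the epoch count is just a value that is enough for this four-row truth table.

```python
from itertools import product


def step(z):
    """Threshold activation: fire (1) only for strictly positive input."""
    return 1 if z > 0 else 0


def neuron(weights, bias, inputs):
    """A single threshold neuron: weighted sum plus bias, then step."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_nand(epochs=20, lr=1):
    """Learn NAND with the perceptron rule (hypothetical helper).

    The training set is the full NAND truth table. Weights start at zero
    and are nudged by lr * error * input after every example.
    """
    weights, bias = [0, 0], 0
    data = [((a, b), 1 - (a & b)) for a, b in product((0, 1), repeat=2)]
    for _ in range(epochs):
        for inputs, target in data:
            error = target - neuron(weights, bias, inputs)
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias


WEIGHTS, BIAS = train_nand()


def nand(a, b):
    """NAND computed by the trained neuron."""
    return neuron(WEIGHTS, BIAS, (a, b))


def xor(a, b):
    """XOR built from four NAND gates, i.e. a tiny feedforward circuit."""
    c = nand(a, b)
    return nand(nand(a, c), nand(b, c))


if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        print(a, b, "nand:", nand(a, b), "xor:", xor(a, b))
```

This only covers the feedforward half of the argument; the sequential half (a flip-flop as a feedback connection) is not modelled here.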
+*** 2018-09-07: zitierkartell + :PROPERTIES: + :CUSTOM_ID: zitierkartell + :END: +[[https://academia.stackexchange.com/questions/116489/counter-strategy-against-group-that-repeatedly-does-strategic-self-citations-and][Counter +strategy against group that repeatedly does strategic self-citations and +ignores other relevant research]] +*** 2018-09-05: short-science + :PROPERTIES: + :CUSTOM_ID: short-science + :END: + +#+begin_quote + + - ShortScience.org is a platform for post-publication discussion + aiming to improve accessibility and reproducibility of research + ideas. + - The website has over 800 summaries, mostly in machine learning, + written by the community and organized by paper, conference, and + year. + - Reading summaries of papers is useful to obtain the perspective and + insight of another reader, why they liked or disliked it, and their + attempt to demystify complicated sections. + - Also, writing summaries is a good exercise to understand the content + of a paper because you are forced to challenge your assumptions when + explaining it. + - Finally, you can keep up to date with the flood of research by + reading the latest summaries on our Twitter and Facebook pages. +#+end_quote + +[[https://shortscience.org][ShortScience.org]] +*** 2018-08-13: darknet-diaries + :PROPERTIES: + :CUSTOM_ID: darknet-diaries + :END: +[[https://darknetdiaries.com][Darknet Diaries]] is a cool podcast. +According to its about page it covers "true stories from the dark side +of the Internet. Stories about hackers, defenders, threats, malware, +botnets, breaches, and privacy." +*** 2018-06-20: coursera-basic-income + :PROPERTIES: + :CUSTOM_ID: coursera-basic-income + :END: +Coursera is having +[[https://www.coursera.org/learn/exploring-basic-income-in-a-changing-economy][a +Teach-Out on Basic Income]]. 
+*** 2018-06-19: pun-generator + :PROPERTIES: + :CUSTOM_ID: pun-generator + :END: +[[https://en.wikipedia.org/wiki/Computational_humor#Pun_generation][Pun +generators exist]]. +*** 2018-06-15: hackers-excerpt + :PROPERTIES: + :CUSTOM_ID: hackers-excerpt + :END: + +#+begin_quote + But as more nontechnical people bought computers, the things that + impressed hackers were not as essential. While the programs themselves + had to maintain a certain standard of quality, it was quite possible + that the most exacting standards---those applied by a hacker who + wanted to add one more feature, or wouldn't let go of a project until + it was demonstrably faster than anything else around---were probably + counterproductive. What seemed more important was marketing. There + were plenty of brilliant programs which no one knew about. Sometimes + hackers would write programs and put them in the public domain, give + them away as easily as John Harris had lent his early copy of + Jawbreaker to the guys at the Fresno computer store. But rarely would + people ask for public domain programs by name: they wanted the ones + they saw advertised and discussed in magazines, demonstrated in + computer stores. It was not so important to have amazingly clever + algorithms. Users would put up with more commonplace ones. + + The Hacker Ethic, of course, held that every program should be as good + as you could make it (or better), infinitely flexible, admired for its + brilliance of concept and execution, and designed to extend the user's + powers. Selling computer programs like toothpaste was heresy. But it + was happening. Consider the prescription for success offered by one of + a panel of high-tech venture capitalists, gathered at a 1982 software + show: "I can summarize what it takes in three words: marketing, + marketing, marketing." When computers are sold like toasters, programs + will be sold like toothpaste. The Hacker Ethic notwithstanding. 
+#+end_quote + +[[http://www.stevenlevy.com/index.php/books/hackers][Hackers: Heroes of +Computer Revolution]], by Steven Levy. +*** 2018-06-11: catalan-overflow + :PROPERTIES: + :CUSTOM_ID: catalan-overflow + :END: +To compute Catalan numbers without unnecessary overflow, use the +recurrence formula \(C_n = {4 n - 2 \over n + 1} C_{n - 1}\). +*** 2018-06-04: boyer-moore + :PROPERTIES: + :CUSTOM_ID: boyer-moore + :END: +The +[[https://en.wikipedia.org/wiki/Boyer–Moore_majority_vote_algorithm][Boyer-Moore +algorithm for finding the majority of a sequence of elements]] falls in +the category of "very clever algorithms". + +#+begin_example + int majorityElement(vector<int>& xs) { + int count = 0; + int maj = xs[0]; + for (auto x : xs) { + if (x == maj) count++; + else if (count == 0) maj = x; + else count--; + } + return maj; + } +#+end_example +*** 2018-05-30: how-to-learn-on-your-own + :PROPERTIES: + :CUSTOM_ID: how-to-learn-on-your-own + :END: +Roger Grosse's post +[[https://metacademy.org/roadmaps/rgrosse/learn_on_your_own][How to +learn on your own (2015)]] is an excellent modern guide on how to learn +and research technical stuff (especially machine learning and maths) on +one's own. +*** 2018-05-25: 2048-mdp + :PROPERTIES: + :CUSTOM_ID: 2048-mdp + :END: +[[http://jdlm.info/articles/2018/03/18/markov-decision-process-2048.html][This +post]] models 2048 as an MDP and solves it using policy iteration and +backward induction. +*** 2018-05-22: ats + :PROPERTIES: + :CUSTOM_ID: ats + :END: + +#+begin_quote + ATS (Applied Type System) is a programming language designed to unify + programming with formal specification. ATS has support for combining + theorem proving with practical programming through the use of advanced + type systems. A past version of The Computer Language Benchmarks Game + has demonstrated that the performance of ATS is comparable to that of + the C and C++ programming languages. 
By using theorem proving and + strict type checking, the compiler can detect and prove that its + implemented functions are not susceptible to bugs such as division by + zero, memory leaks, buffer overflow, and other forms of memory + corruption by verifying pointer arithmetic and reference counting + before the program compiles. Additionally, by using the integrated + theorem-proving system of ATS (ATS/LF), the programmer may make use of + static constructs that are intertwined with the operative code to + prove that a function attains its specification. +#+end_quote + +[[https://en.wikipedia.org/wiki/ATS_(programming_language)][Wikipedia +entry on ATS]] +*** 2018-05-20: bostoncalling + :PROPERTIES: + :CUSTOM_ID: bostoncalling + :END: +(5-second fame) I sent a picture of my kitchen sink to BBC and got +mentioned in the [[https://www.bbc.co.uk/programmes/w3cswg8c][latest +Boston Calling episode]] (listen at 25:54). +*** 2018-05-18: colah-blog + :PROPERTIES: + :CUSTOM_ID: colah-blog + :END: +[[https://colah.github.io/][colah's blog]] has a cool feature that +allows you to comment on any paragraph of a blog post. Here's an +[[https://colah.github.io/posts/2015-08-Understanding-LSTMs/][example]]. +If it is doable on a static site hosted on Github pages, I suppose it +shouldn't be too hard to implement. This also seems to work more +seamlessly than [[https://fermatslibrary.com/][Fermat's Library]], +because the latter has to embed pdfs in webpages. Now fantasy time: +imagine that one day arXiv shows html versions of papers (through author +uploading or conversion from TeX) with this feature. +*** 2018-05-15: random-forests + :PROPERTIES: + :CUSTOM_ID: random-forests + :END: +[[https://lagunita.stanford.edu/courses/HumanitiesSciences/StatLearning/Winter2016/info][Stanford +Lagunita's statistical learning course]] has some excellent lectures on +random forests. 
It starts with explanations of decision trees, followed +by bagged trees and random forests, and ends with boosting. From these +lectures it seems that: + +1. The term "predictors" in statistical learning = "features" in machine + learning. +2. The main idea of random forests of dropping predictors for individual + trees and aggregating by majority or average is the same as the idea of + dropout in neural networks, where a proportion of neurons in the + hidden layers are dropped temporarily during different minibatches of + training, effectively averaging over an ensemble of subnetworks. Both + tricks are used as regularisations, i.e. to reduce the variance. The + only difference is: in random forests, all but a square root number + of the total number of features are dropped, whereas the dropout + ratio in neural networks is usually a half. + +By the way, here's a comparison between statistical learning and machine +learning from the slides of the Statistical Learning course: +*** 2018-05-14: open-review-net + :PROPERTIES: + :CUSTOM_ID: open-review-net + :END: +Open peer review means peer review process where communications +e.g. comments and responses are public. + +Like [[https://scipost.org/][SciPost]] mentioned in +[[/posts/2018-04-10-update-open-research.html][my post]], +[[https://openreview.net][OpenReview.net]] is an example of open peer +review in research. It looks like their focus is machine learning. Their +[[https://openreview.net/about][about page]] states their mission, and +here's [[https://openreview.net/group?id=ICLR.cc/2018/Conference][an +example]] where you can click on each entry to see what it is like. We +definitely need this in the maths research community. +*** 2018-05-11: rnn-fsm + :PROPERTIES: + :CUSTOM_ID: rnn-fsm + :END: +Related to [[#neural-turing-machine][a previous micropost]]. 
+ +[[http://www.cs.toronto.edu/~rgrosse/csc321/lec9.pdf][These slides from +Toronto]] are a nice introduction to RNN (recurrent neural network) from +a computational point of view. It states that RNN can simulate any FSM +(finite state machine, a.k.a. finite automata abbr. FA) with a toy +example computing the parity of a binary string. + +[[http://www.deeplearningbook.org/contents/rnn.html][Goodfellow et. +al.'s book]] (see page 372 and 374) goes one step further, stating that +RNN with a hidden-to-hidden layer can simulate Turing machines, and not +only that, but also the /universal/ Turing machine abbr. UTM (the book +referenced +[[https://www.sciencedirect.com/science/article/pii/S0022000085710136][Siegelmann-Sontag]]), +a property not shared by the weaker network where the hidden-to-hidden +layer is replaced by an output-to-hidden layer (page 376). + +By the way, the RNN with a hidden-to-hidden layer has the same +architecture as the so-called linear dynamical system mentioned in +[[https://www.coursera.org/learn/neural-networks/lecture/Fpa7y/modeling-sequences-a-brief-overview][Hinton's +video]]. + +From what I have learned, the universality of RNN and feedforward +networks are therefore due to different arguments, the former coming +from Turing machines and the latter from an analytical view of +approximation by step functions. +*** 2018-05-10: math-writing-decoupling + :PROPERTIES: + :CUSTOM_ID: math-writing-decoupling + :END: +One way to write readable mathematics is to decouple concepts. One idea +is the following template. First write a toy example with all the +important components present in this example, then analyse each +component individually and elaborate how (perhaps more complex) +variations of the component can extend the toy example and induce more +complex or powerful versions of the toy example. Through such +incremental development, one should be able to arrive at any result in +cutting edge research after a pleasant journey. 
+ +It's a bit like the UNIX philosophy, where you have a basic system of +modules like IO, memory management, graphics etc, and modify / improve +each module individually (H/t [[http://nand2tetris.org/][NAND2Tetris]]). + +The book [[http://neuralnetworksanddeeplearning.com/][Neural networks +and deep learning]] by Michael Nielsen is an example of such an approach. +It begins the journey with a very simple neural net with one hidden +layer, no regularisation, and sigmoid activations. It then analyses each +component including cost functions, the back propagation algorithm, the +activation functions, regularisation and the overall architecture (from +fully connected to CNN) individually and improves the toy example +incrementally. Over the course the accuracy of the example of MNIST +grows incrementally from 95.42% to 99.67%. +*** 2018-05-09: neural-turing-machine + :PROPERTIES: + :CUSTOM_ID: neural-turing-machine + :END: + +#+begin_quote + One way RNNs are currently being used is to connect neural networks + more closely to traditional ways of thinking about algorithms, ways of + thinking based on concepts such as Turing machines and (conventional) + programming languages. [[https://arxiv.org/abs/1410.4615][A 2014 + paper]] developed an RNN which could take as input a + character-by-character description of a (very, very simple!) Python + program, and use that description to predict the output. Informally, + the network is learning to "understand" certain Python programs. + [[https://arxiv.org/abs/1410.5401][A second paper, also from 2014]], + used RNNs as a starting point to develop what they called a neural + Turing machine (NTM). This is a universal computer whose entire + structure can be trained using gradient descent. They trained their + NTM to infer algorithms for several simple problems, such as sorting + and copying. + + As it stands, these are extremely simple toy models. 
Learning to + execute the Python program =print(398345+42598)= doesn't make a + network into a full-fledged Python interpreter! It's not clear how + much further it will be possible to push the ideas. Still, the results + are intriguing. Historically, neural networks have done well at + pattern recognition problems where conventional algorithmic approaches + have trouble. Vice versa, conventional algorithmic approaches are good + at solving problems that neural nets aren't so good at. No-one today + implements a web server or a database program using a neural network! + It'd be great to develop unified models that integrate the strengths + of both neural networks and more traditional approaches to algorithms. + RNNs and ideas inspired by RNNs may help us do that. +#+end_quote + +Michael Nielsen, +[[http://neuralnetworksanddeeplearning.com/chap6.html#other_approaches_to_deep_neural_nets][Neural +networks and deep learning]] +*** 2018-05-09: neural-nets-activation + :PROPERTIES: + :CUSTOM_ID: neural-nets-activation + :END: + +#+begin_quote + What makes the rectified linear activation function better than the + sigmoid or tanh functions? At present, we have a poor understanding of + the answer to this question. Indeed, rectified linear units have only + begun to be widely used in the past few years. The reason for that + recent adoption is empirical: a few people tried rectified linear + units, often on the basis of hunches or heuristic arguments. They got + good results classifying benchmark data sets, and the practice has + spread. In an ideal world we'd have a theory telling us which + activation function to pick for which application. But at present + we're a long way from such a world. I should not be at all surprised + if further major improvements can be obtained by an even better choice + of activation function. And I also expect that in coming decades a + powerful theory of activation functions will be developed. 
Today, we + still have to rely on poorly understood rules of thumb and experience. +#+end_quote + +Michael Nielsen, +[[http://neuralnetworksanddeeplearning.com/chap6.html#convolutional_neural_networks_in_practice][Neural +networks and deep learning]] +*** 2018-05-08: sql-injection-video + :PROPERTIES: + :CUSTOM_ID: sql-injection-video + :END: +Computerphile has some brilliant educational videos on computer science, +like [[https://www.youtube.com/watch?v=ciNHn38EyRc][a demo of SQL +injection]], [[https://www.youtube.com/watch?v=eis11j_iGMs][a toy +example of the lambda calculus]], and +[[https://www.youtube.com/watch?v=9T8A89jgeTI][explaining the Y +combinator]]. +*** 2018-05-08: nlp-arxiv + :PROPERTIES: + :CUSTOM_ID: nlp-arxiv + :END: +Primer Science is a tool by a startup called Primer that uses NLP to +summarize contents (but not single papers, yet) on arxiv. A developer of +this tool predicts in +[[https://twimlai.com/twiml-talk-136-taming-arxiv-w-natural-language-processing-with-john-bohannon/#][an +interview]] that progress on AI's ability to extract meanings from AI +research papers will be the biggest accelerant on AI research. +*** 2018-05-08: neural-nets-regularization + :PROPERTIES: + :CUSTOM_ID: neural-nets-regularization + :END: + +#+begin_quote + no-one has yet developed an entirely convincing theoretical + explanation for why regularization helps networks generalize. Indeed, + researchers continue to write papers where they try different + approaches to regularization, compare them to see which works better, + and attempt to understand why different approaches work better or + worse. And so you can view regularization as something of a kludge. + While it often helps, we don't have an entirely satisfactory + systematic understanding of what's going on, merely incomplete + heuristics and rules of thumb. + + There's a deeper set of issues here, issues which go to the heart of + science. It's the question of how we generalize. 
Regularization may + give us a computational magic wand that helps our networks generalize + better, but it doesn't give us a principled understanding of how + generalization works, nor of what the best approach is. +#+end_quote + +Michael Nielsen, +[[http://neuralnetworksanddeeplearning.com/chap3.html#why_does_regularization_help_reduce_overfitting][Neural +networks and deep learning]] +*** 2018-05-07: learning-knowledge-graph-reddit-journal-club + :PROPERTIES: + :CUSTOM_ID: learning-knowledge-graph-reddit-journal-club + :END: +It is a natural idea to look for ways to learn things like going through +a skill tree in a computer RPG. + +For example I made a +[[https://ypei.me/posts/2015-04-02-juggling-skill-tree.html][DAG for +juggling]]. + +Websites like [[https://knowen.org][Knowen]] and +[[https://metacademy.org][Metacademy]] explore this idea with added +flavour of open collaboration. + +The design of Metacademy looks quite promising. It also has a nice +tagline: "your package manager for knowledge". + +There are so so many tools to assist learning / research / knowledge +sharing today, and we should keep experimenting, in the hope that +eventually one of them will scale. + +On another note, I often complain about the lack of a place to discuss +math research online, but today I found on Reddit some journal clubs on +machine learning: +[[https://www.reddit.com/r/MachineLearning/comments/8aluhs/d_machine_learning_wayr_what_are_you_reading_week/][1]], +[[https://www.reddit.com/r/MachineLearning/comments/8elmd8/d_anyone_having_trouble_reading_a_particular/][2]]. +If only we had this for maths. On the other hand r/math does have some +interesting recurring threads as well: +[[https://www.reddit.com/r/math/wiki/everythingaboutx][Everything about +X]] and +[[https://www.reddit.com/r/math/search?q=what+are+you+working+on?+author:automoderator+&sort=new&restrict_sr=on&t=all][What +Are You Working On?]]. Hopefully these threads can last for years to +come. 
+*** 2018-05-02: simple-solution-lack-of-math-rendering + :PROPERTIES: + :CUSTOM_ID: simple-solution-lack-of-math-rendering + :END: +The lack of maths rendering in major online communication platforms like +instant messaging, email or Github has been a minor obsession of mine +for quite a while, as I saw it as a big factor preventing people from +talking more maths online. But today I realised this is totally a +non-issue. Just do what people on IRC have been doing since the +inception of the universe: use a (latex) pastebin. +*** 2018-05-01: neural-networks-programming-paradigm + :PROPERTIES: + :CUSTOM_ID: neural-networks-programming-paradigm + :END: + +#+begin_quote + Neural networks are one of the most beautiful programming paradigms + ever invented. In the conventional approach to programming, we tell + the computer what to do, breaking big problems up into many small, + precisely defined tasks that the computer can easily perform. By + contrast, in a neural network we don't tell the computer how to solve + our problem. Instead, it learns from observational data, figuring out + its own solution to the problem at hand. +#+end_quote + +Michael Nielsen - +[[http://neuralnetworksanddeeplearning.com/about.html][What this book +(Neural Networks and Deep Learning) is about]] + +Unrelated to the quote, note that Nielsen's book is licensed under +[[https://creativecommons.org/licenses/by-nc/3.0/deed.en_GB][CC BY-NC]], +so one can build on it and redistribute non-commercially. +*** 2018-04-30: google-search-not-ai + :PROPERTIES: + :CUSTOM_ID: google-search-not-ai + :END: + +#+begin_quote + But, users have learned to accommodate to Google not the other way + around. We know what kinds of things we can type into Google and what + we can't and we keep our searches to things that Google is likely to + help with. We know we are looking for texts and not answers to start a + conversation with an entity that knows what we really need to talk + about. 
People learn from conversation and Google can't have one. It + can pretend to have one using Siri but really those conversations tend + to get tiresome when you are past asking about where to eat. +#+end_quote + +Roger Schank - +[[http://www.rogerschank.com/fraudulent-claims-made-by-IBM-about-Watson-and-AI][Fraudulent +claims made by IBM about Watson and AI]] +*** 2018-04-06: hacker-ethics + :PROPERTIES: + :CUSTOM_ID: hacker-ethics + :END: + +#+begin_quote + + - Access to computers---and anything that might teach you something + about the way the world works---should be unlimited and total. + Always yield to the Hands-On Imperative! + - All information should be free. + - Mistrust Authority---Promote Decentralization. + - Hackers should be judged by their hacking, not bogus criteria such + as degrees, age, race, or position. + - You can create art and beauty on a computer. + - Computers can change your life for the better. +#+end_quote + +[[https://en.wikipedia.org/wiki/Hacker_ethic][The Hacker Ethic]], +[[https://en.wikipedia.org/wiki/Hackers:_Heroes_of_the_Computer_Revolution][Hackers: +Heroes of Computer Revolution]], by Steven Levy +*** 2018-03-23: static-site-generator + :PROPERTIES: + :CUSTOM_ID: static-site-generator + :END: + +#+begin_quote + "Static site generators seem like music databases, in that everyone + eventually writes their own crappy one that just barely scratches the + itch they had (and I'm no exception)." +#+end_quote + +__david__@hackernews + +So did I. 
diff --git a/pages/blog.org b/pages/blog.org new file mode 100644 index 0000000..d8928f5 --- /dev/null +++ b/pages/blog.org @@ -0,0 +1,20 @@ +#+TITLE: All posts + +- *[[file:posts/2019-03-14-great-but-manageable-expectations.org][Great but Manageable Expectations]]* - 2019-03-14 +- *[[file:posts/2019-03-13-a-tail-of-two-densities.org][A Tail of Two Densities]]* - 2019-03-13 +- *[[file:posts/2019-02-14-raise-your-elbo.org][Raise your ELBO]]* - 2019-02-14 +- *[[file:posts/2019-01-03-discriminant-analysis.org][Discriminant analysis]]* - 2019-01-03 +- *[[file:posts/2018-12-02-lime-shapley.org][Shapley, LIME and SHAP]]* - 2018-12-02 +- *[[file:posts/2018-06-03-automatic_differentiation.org][Automatic differentiation]]* - 2018-06-03 +- *[[file:posts/2018-04-10-update-open-research.org][Updates on open research]]* - 2018-04-29 +- *[[file:posts/2017-08-07-mathematical_bazaar.org][The Mathematical Bazaar]]* - 2017-08-07 +- *[[file:posts/2017-04-25-open_research_toywiki.org][Open mathematical research and launching toywiki]]* - 2017-04-25 +- *[[file:posts/2016-10-13-q-robinson-schensted-knuth-polymer.org][A \(q\)-Robinson-Schensted-Knuth algorithm and a \(q\)-polymer]]* - 2016-10-13 +- *[[file:posts/2015-07-15-double-macdonald-polynomials-macdonald-superpolynomials.org][AMS review of 'Double Macdonald polynomials as the stable limit of Macdonald superpolynomials' by Blondeau-Fournier, Lapointe and Mathieu]]* - 2015-07-15 +- *[[file:posts/2015-07-01-causal-quantum-product-levy-area.org][On a causal quantum double product integral related to Lévy stochastic area.]]* - 2015-07-01 +- *[[file:posts/2015-05-30-infinite-binary-words-containing-repetitions-odd-periods.org][AMS review of 'Infinite binary words containing repetitions of odd period' by Badkobeh and Crochemore]]* - 2015-05-30 +- *[[file:posts/2015-04-02-juggling-skill-tree.org][jst]]* - 2015-04-02 +- *[[file:posts/2015-04-01-unitary-double-products.org][Unitary causal quantum stochastic double products as universal]]* - 
2015-04-01 +- *[[file:posts/2015-01-20-weighted-interpretation-super-catalan-numbers.org][AMS review of 'A weighted interpretation for the super Catalan]]* - 2015-01-20 +- *[[file:posts/2014-04-01-q-robinson-schensted-symmetry-paper.org][Symmetry property of \(q\)-weighted Robinson-Schensted algorithms and branching algorithms]]* - 2014-04-01 +- *[[file:posts/2013-06-01-q-robinson-schensted-paper.org][A \(q\)-weighted Robinson-Schensted algorithm]]* - 2013-06-01
\ No newline at end of file diff --git a/pages/microblog.org b/pages/microblog.org new file mode 100644 index 0000000..fb39a67 --- /dev/null +++ b/pages/microblog.org @@ -0,0 +1,683 @@ +#+TITLE: Microblog + +- 2020-08-02 - *[[file:microposts/ia-lawsuit.org][ia-lawsuit]]* + + The four big publishers Hachette, HarperCollins, Wiley, and Penguin + Random House are still pursuing Internet Archive. + + #+begin_quote + [Their] lawsuit does not stop at seeking to end the practice of + Controlled Digital Lending. These publishers call for the destruction + of the 1.5 million digital books that Internet Archive makes available + to our patrons. This form of digital book burning is unprecedented and + unfairly disadvantages people with print disabilities. For the blind, + ebooks are a lifeline, yet less than one in ten exists in accessible + formats. Since 2010, Internet Archive has made our lending library + available to the blind and print disabled community, in addition to + sighted users. If the publishers are successful with their lawsuit, + more than a million of those books would be deleted from the + Internet's digital shelves forever. + #+end_quote + + [[https://blog.archive.org/2020/07/29/internet-archive-responds-to-publishers-lawsuit/][Libraries + lend books, and must continue to lend books: Internet Archive responds + to publishers' lawsuit]] +- 2020-08-02 - *[[file:microposts/fsf-membership.org][fsf-membership]]* + + I am a proud associate member of the Free Software Foundation. For me the + philosophy of Free Software is about ensuring the enrichment of a + digital commons, so that knowledge and information are not concentrated + in the hands of selected privileged people and locked up as + "intellectual property". The genius of copyleft licenses like GNU (A)GPL + ensures that software released for the public remains public. Open source + does not care about that.
+ + If you also care about the public good, the hacker ethic, or the spirit + of the web, please take a moment to consider joining FSF as an associate + member. It comes with [[https://www.fsf.org/associate/benefits][numerous + perks and benefits]]. +- 2020-06-21 - *[[file:microposts/how-can-you-help-ia.org][how-can-you-help-ia]]* + + [[https://blog.archive.org/2020/06/14/how-can-you-help-the-internet-archive/][How + can you help the Internet Archive?]] Use it. It's more than the Wayback + Machine. And get involved. +- 2020-06-12 - *[[file:microposts/open-library.org][open-library]]* + + Open Library was cofounded by Aaron Swartz. As part of the Internet + Archive, it has done good work to spread knowledge. However it is + currently + [[https://arstechnica.com/tech-policy/2020/06/internet-archive-ends-emergency-library-early-to-appease-publishers/][being + sued by four major publishers]] for the + [[https://archive.org/details/nationalemergencylibrary][National + Emergency Library]]. IA decided to + [[https://blog.archive.org/2020/06/10/temporary-national-emergency-library-to-close-2-weeks-early-returning-to-traditional-controlled-digital-lending/][close + the NEL two weeks earlier than planned]], but the lawsuit is not over, + which in the worst-case scenario has the danger of resulting in + Controlled Digital Lending being considered illegal and (less likely) + bankruptcy of the Internet Archive. If this happens it will be a big + setback for the free-culture movement. +- 2020-04-15 - *[[file:microposts/sanders-suspend-campaign.org][sanders-suspend-campaign]]* + + Suspending the campaign is different from dropping out of the race. + Bernie Sanders remains on the ballot, and indeed in his campaign + suspension speech he encouraged people to continue voting for him in the + Democratic primaries to push for changes in the convention.
+- 2019-09-30 - *[[file:microposts/defense-stallman.org][defense-stallman]]* + + Someone wrote a bold article titled + [[https://geoff.greer.fm/2019/09/30/in-defense-of-richard-stallman/]["In + Defense of Richard Stallman"]]. Kudos to him. + + Also, an interesting read: + [[https://cfenollosa.com/blog/famous-computer-public-figure-suffers-the-consequences-for-asshole-ish-behavior.html][Famous + public figure in tech suffers the consequences for asshole-ish + behavior]]. +- 2019-09-29 - *[[file:microposts/stallman-resign.org][stallman-resign]]* + + Last week Richard Stallman resigned from FSF. It is a great loss for the + free software movement. + + The apparent cause of his resignation and the events that triggered it + reflect some alarming trends of the zeitgeist. Here is a detailed review + of what happened: [[https://sterling-archermedes.github.io/][Low grade + "journalists" and internet mob attack RMS with lies. In-depth review.]]. + Some interesting articles on this are: + [[https://jackbaruth.com/?p=16779][Weekly Roundup: The Passion Of Saint + iGNUcius Edition]], + [[http://techrights.org/2019/09/17/rms-witch-hunt/][Why I Once Called + for Richard Stallman to Step Down]]. + + Dishonest and misleading media pieces involved in this incident include + [[https://www.thedailybeast.com/famed-mit-computer-scientist-richard-stallman-defends-epstein-victims-were-entirely-willing][The + Daily Beast]], + [[https://www.vice.com/en_us/article/9ke3ke/famed-computer-scientist-richard-stallman-described-epstein-victims-as-entirely-willing][Vice]], + [[https://techcrunch.com/2019/09/16/computer-scientist-richard-stallman-who-defended-jeffrey-epstein-resigns-from-mit-csail-and-the-free-software-foundation/][Tech + Crunch]], + [[https://www.wired.com/story/richard-stallmans-exit-heralds-a-new-era-in-tech/][Wired]]. +- 2019-03-16 - *[[file:microposts/decss-haiku.org][decss-haiku]]* + + #+begin_quote + #+begin_example + Muse! 
When we learned to + count, little did we know all + the things we could do + + some day by shuffling + those numbers: Pythagoras + said "All is number" + + long before he saw + computers and their effects, + or what they could do + + by computation, + naive and mechanical + fast arithmetic. + + It changed the world, it + changed our consciousness and lives + to have such fast math + + available to + us and anyone who cared + to learn programming. + + Now help me, Muse, for + I wish to tell a piece of + controversial math, + + for which the lawyers + of DVD CCA + don't forbear to sue: + + that they alone should + know or have the right to teach + these skills and these rules. + + (Do they understand + the content, or is it just + the effects they see?) + + And all mathematics + is full of stories (just read + Eric Temple Bell); + + and CSS is + no exception to this rule. + Sing, Muse, decryption + + once secret, as all + knowledge, once unknown: how to + decrypt DVDs. + #+end_example + #+end_quote + + Seth Schoen, [[https://en.wikipedia.org/wiki/DeCSS_haiku][DeCSS haiku]] +- 2019-01-27 - *[[file:microposts/learning-undecidable.org][learning-undecidable]]* + + My take on the + [[https://www.nature.com/articles/s42256-018-0002-3][Nature paper + /Learning can be undecidable/]]: + + Fantastic article, very clearly written. + + So it reduces a kind of learnability called estimating the maximum + (EMX) to a question about the cardinality of the real numbers, which is + undecidable. + + When it comes to the relation between EMX and the rest of the machine + learning framework, the article mentions that EMX belongs to "extensions + of PAC learnability include Vapnik's statistical learning setting and + the equivalent general learning setting by Shalev-Shwartz and + colleagues" (I have no idea what these two things are), but it does not + say whether EMX is representative of or reduces to common learning + tasks. So it is not clear whether its undecidability applies to ML at + large.
+ + Another condition to the main theorem is the union bounded closure + assumption. It seems a reasonable property of a family of sets, but then + again I wonder how that translates to learning. + + The article says "By now, we know of quite a few independence [from + mathematical axioms] results, mostly for set theoretic questions like + the continuum hypothesis, but also for results in algebra, analysis, + infinite combinatorics and more. Machine learning, so far, has escaped + this fate." but the description of the EMX learnability makes it more + like a classical mathematical / theoretical computer science problem + rather than machine learning. + + An insightful conclusion: "How come learnability can neither be proved + nor refuted? A closer look reveals that the source of the problem is in + defining learnability as the existence of a learning function rather + than the existence of a learning algorithm. In contrast with the + existence of algorithms, the existence of functions over infinite + domains is a (logically) subtle issue." + + In relation to practical problems, it uses an example of ad targeting. + However, a lot is lost in translation from the main theorem to this ad + example. + + The EMX problem states: given a domain X, a distribution P over X which + is unknown, some samples from P, and a family of subsets of X called F, + find A in F that approximately maximises P(A). + + The undecidability rests on X being the continuous [0, 1] interval, and + from the insight, we know the problem comes from the cardinality of + subsets of the [0, 1] interval, which is "logically subtle". + + In the ad problem, the domain X is all potential visitors, which is + finite because there is a finite number of people in the world. In this + case P is a categorical distribution over 1..n where n is the + population of the world.
One can have a good estimate of the parameters + of a categorical distribution by asking for a sufficiently large number of + samples and computing the empirical distribution. Let's call the + estimated distribution Q. One can then choose from F (also finite) the + set that maximises Q(A), which will be a solution to EMX. + + In other words, the theorem states: EMX is undecidable because not all + EMX instances are decidable, because there are some nasty ones due to + infinities. That does not mean no EMX instance is decidable. And I think + the ad instance is decidable. Is there a learning task that actually + corresponds to an undecidable EMX instance? I don't know, but I will not + believe the result of this paper is useful until I see one. + + h/t Reynaldo Boulogne +- 2018-12-11 - *[[file:microposts/gavin-belson.org][gavin-belson]]* + + #+begin_quote + I don't know about you people, but I don't want to live in a world + where someone else makes the world a better place better than we do. + #+end_quote + + Gavin Belson, Silicon Valley S2E1. + + I came across this quote in + [[https://slate.com/business/2018/12/facebook-emails-lawsuit-embarrassing-mark-zuckerberg.html][a + Slate post about Facebook]] +- 2018-10-05 - *[[file:microposts/margins.org][margins]]* + + With Fermat's Library's new tool + [[https://fermatslibrary.com/margins][margins]], you can host your own + journal club. +- 2018-09-18 - *[[file:microposts/rnn-turing.org][rnn-turing]]* + + Just some non-rigorous guess / thought: Feedforward networks are like + combinatorial logic, and recurrent networks are like sequential logic + (e.g. data flip-flop is like the feedback connection in RNN). Since + NAND + combinatorial logic + sequential logic = von Neumann machine which is + an approximation of the Turing machine, it is not surprising that RNN + (with feedforward networks) is Turing complete (assuming that neural + networks can learn the NAND gate).
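The NAND assumption in the rnn-turing note can be made concrete with a small sketch. It only shows that a single threshold neuron can represent NAND, and that a feedforward arrangement of such neurons gives XOR; it says nothing about whether training would find these weights. The weights (-2, -2), the bias 3, and the function names are illustrative choices, not taken from the post.

```cpp
// Sketch: one threshold neuron (perceptron) that computes NAND.
// The weights (-2, -2) and bias 3 are hand-picked for illustration,
// not learned; all names here are hypothetical.

// Fires (returns 1) when the weighted sum plus bias is positive.
constexpr int neuron(int w1, int w2, int bias, int a, int b) {
    return (w1 * a + w2 * b + bias > 0) ? 1 : 0;
}

// NAND as a single neuron.
constexpr int nand_neuron(int a, int b) {
    return neuron(-2, -2, 3, a, b);
}

// XOR built only from NAND neurons, i.e. a small feedforward network
// playing the role of combinatorial logic.
constexpr int xor_net(int a, int b) {
    return nand_neuron(nand_neuron(a, nand_neuron(a, b)),
                       nand_neuron(b, nand_neuron(a, b)));
}
```

Since NAND is functionally complete, any combinatorial circuit can in principle be assembled from `nand_neuron` in the same way; the sequential part of the analogy would need a recurrent connection, which this sketch leaves out.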
+- 2018-09-07 - *[[file:microposts/zitierkartell.org][zitierkartell]]* + + [[https://academia.stackexchange.com/questions/116489/counter-strategy-against-group-that-repeatedly-does-strategic-self-citations-and][Counter + strategy against group that repeatedly does strategic self-citations and + ignores other relevant research]] +- 2018-09-05 - *[[file:microposts/short-science.org][short-science]]* + + #+begin_quote + + + - ShortScience.org is a platform for post-publication discussion + aiming to improve accessibility and reproducibility of research + ideas. + - The website has over 800 summaries, mostly in machine learning, + written by the community and organized by paper, conference, and + year. + - Reading summaries of papers is useful to obtain the perspective and + insight of another reader, why they liked or disliked it, and their + attempt to demystify complicated sections. + - Also, writing summaries is a good exercise to understand the content + of a paper because you are forced to challenge your assumptions when + explaining it. + - Finally, you can keep up to date with the flood of research by + reading the latest summaries on our Twitter and Facebook pages. + #+end_quote + + [[https://shortscience.org][ShortScience.org]] +- 2018-08-13 - *[[file:microposts/darknet-diaries.org][darknet-diaries]]* + + [[https://darknetdiaries.com][Darknet Diaries]] is a cool podcast. + According to its about page it covers "true stories from the dark side + of the Internet. Stories about hackers, defenders, threats, malware, + botnets, breaches, and privacy." +- 2018-06-20 - *[[file:microposts/coursera-basic-income.org][coursera-basic-income]]* + + Coursera is having + [[https://www.coursera.org/learn/exploring-basic-income-in-a-changing-economy][a + Teach-Out on Basic Income]]. +- 2018-06-19 - *[[file:microposts/pun-generator.org][pun-generator]]* + + [[https://en.wikipedia.org/wiki/Computational_humor#Pun_generation][Pun + generators exist]]. 
+- 2018-06-15 - *[[file:microposts/hackers-excerpt.org][hackers-excerpt]]* + + #+begin_quote + But as more nontechnical people bought computers, the things that + impressed hackers were not as essential. While the programs themselves + had to maintain a certain standard of quality, it was quite possible + that the most exacting standards---those applied by a hacker who + wanted to add one more feature, or wouldn't let go of a project until + it was demonstrably faster than anything else around---were probably + counterproductive. What seemed more important was marketing. There + were plenty of brilliant programs which no one knew about. Sometimes + hackers would write programs and put them in the public domain, give + them away as easily as John Harris had lent his early copy of + Jawbreaker to the guys at the Fresno computer store. But rarely would + people ask for public domain programs by name: they wanted the ones + they saw advertised and discussed in magazines, demonstrated in + computer stores. It was not so important to have amazingly clever + algorithms. Users would put up with more commonplace ones. + + The Hacker Ethic, of course, held that every program should be as good + as you could make it (or better), infinitely flexible, admired for its + brilliance of concept and execution, and designed to extend the user's + powers. Selling computer programs like toothpaste was heresy. But it + was happening. Consider the prescription for success offered by one of + a panel of high-tech venture capitalists, gathered at a 1982 software + show: "I can summarize what it takes in three words: marketing, + marketing, marketing." When computers are sold like toasters, programs + will be sold like toothpaste. The Hacker Ethic notwithstanding. + #+end_quote + + [[http://www.stevenlevy.com/index.php/books/hackers][Hackers: Heroes of + Computer Revolution]], by Steven Levy. 
+- 2018-06-11 - *[[file:microposts/catalan-overflow.org][catalan-overflow]]* + + To compute Catalan numbers without unnecessary overflow, use the + recurrence formula \(C_n = {4 n - 2 \over n + 1} C_{n - 1}\). +- 2018-06-04 - *[[file:microposts/boyer-moore.org][boyer-moore]]* + + The + [[https://en.wikipedia.org/wiki/Boyer–Moore_majority_vote_algorithm][Boyer-Moore + algorithm for finding the majority of a sequence of elements]] falls in + the category of "very clever algorithms". + + #+begin_example + int majorityElement(vector<int>& xs) { + int count = 0; + int maj = xs[0]; + for (auto x : xs) { + if (x == maj) count++; + else if (count == 0) maj = x; + else count--; + } + return maj; + } + #+end_example +- 2018-05-30 - *[[file:microposts/how-to-learn-on-your-own.org][how-to-learn-on-your-own]]* + + Roger Grosse's post + [[https://metacademy.org/roadmaps/rgrosse/learn_on_your_own][How to + learn on your own (2015)]] is an excellent modern guide on how to learn + and research technical stuff (especially machine learning and maths) on + one's own. +- 2018-05-25 - *[[file:microposts/2048-mdp.org][2048-mdp]]* + + [[http://jdlm.info/articles/2018/03/18/markov-decision-process-2048.html][This + post]] models 2048 as an MDP and solves it using policy iteration and + backward induction. +- 2018-05-22 - *[[file:microposts/ats.org][ats]]* + + #+begin_quote + ATS (Applied Type System) is a programming language designed to unify + programming with formal specification. ATS has support for combining + theorem proving with practical programming through the use of advanced + type systems. A past version of The Computer Language Benchmarks Game + has demonstrated that the performance of ATS is comparable to that of + the C and C++ programming languages. 
By using theorem proving and + strict type checking, the compiler can detect and prove that its + implemented functions are not susceptible to bugs such as division by + zero, memory leaks, buffer overflow, and other forms of memory + corruption by verifying pointer arithmetic and reference counting + before the program compiles. Additionally, by using the integrated + theorem-proving system of ATS (ATS/LF), the programmer may make use of + static constructs that are intertwined with the operative code to + prove that a function attains its specification. + #+end_quote + + [[https://en.wikipedia.org/wiki/ATS_(programming_language)][Wikipedia + entry on ATS]] +- 2018-05-20 - *[[file:microposts/bostoncalling.org][bostoncalling]]* + + (5-second fame) I sent a picture of my kitchen sink to BBC and got + mentioned in the [[https://www.bbc.co.uk/programmes/w3cswg8c][latest + Boston Calling episode]] (listen at 25:54). +- 2018-05-18 - *[[file:microposts/colah-blog.org][colah-blog]]* + + [[https://colah.github.io/][colah's blog]] has a cool feature that + allows you to comment on any paragraph of a blog post. Here's an + [[https://colah.github.io/posts/2015-08-Understanding-LSTMs/][example]]. + If it is doable on a static site hosted on Github pages, I suppose it + shouldn't be too hard to implement. This also seems to work more + seamlessly than [[https://fermatslibrary.com/][Fermat's Library]], + because the latter has to embed pdfs in webpages. Now fantasy time: + imagine that one day arXiv shows html versions of papers (through author + uploading or conversion from TeX) with this feature. +- 2018-05-15 - *[[file:microposts/random-forests.org][random-forests]]* + + [[https://lagunita.stanford.edu/courses/HumanitiesSciences/StatLearning/Winter2016/info][Stanford + Lagunita's statistical learning course]] has some excellent lectures on + random forests. It starts with explanations of decision trees, followed + by bagged trees and random forests, and ends with boosting. 
From these + lectures it seems that: + + 1. The term "predictors" in statistical learning = "features" in machine + learning. + 2. The main idea of random forests of dropping predictors for individual + trees and aggregating by majority or average is the same as the idea of + dropout in neural networks, where a proportion of neurons in the + hidden layers are dropped temporarily during different minibatches of + training, effectively averaging over an ensemble of subnetworks. Both + tricks are used as regularisations, i.e. to reduce the variance. The + only difference is: in random forests, all but a square root number + of the total number of features are dropped, whereas the dropout + ratio in neural networks is usually a half. + + By the way, here's a comparison between statistical learning and machine + learning from the slides of the Statistical Learning course: +- 2018-05-14 - *[[file:microposts/open-review-net.org][open-review-net]]* + + Open peer review means a peer review process where communications + e.g. comments and responses are public. + + Like [[https://scipost.org/][SciPost]] mentioned in + [[file:/posts/2018-04-10-update-open-research.html][my post]], + [[https://openreview.net][OpenReview.net]] is an example of open peer + review in research. It looks like their focus is machine learning. Their + [[https://openreview.net/about][about page]] states their mission, and + here's [[https://openreview.net/group?id=ICLR.cc/2018/Conference][an + example]] where you can click on each entry to see what it is like. We + definitely need this in the maths research community. +- 2018-05-11 - *[[file:microposts/rnn-fsm.org][rnn-fsm]]* + + Related to [[file:neural-turing-machine][a previous micropost]]. + + [[http://www.cs.toronto.edu/~rgrosse/csc321/lec9.pdf][These slides from + Toronto]] are a nice introduction to RNN (recurrent neural network) from + a computational point of view. They state that RNN can simulate any FSM + (finite state machine, a.k.a.
finite automaton, abbr. FA) with a toy + example computing the parity of a binary string. + + [[http://www.deeplearningbook.org/contents/rnn.html][Goodfellow et + al.'s book]] (see page 372 and 374) goes one step further, stating that + RNN with a hidden-to-hidden layer can simulate Turing machines, and not + only that, but also the /universal/ Turing machine abbr. UTM (the book + referenced + [[https://www.sciencedirect.com/science/article/pii/S0022000085710136][Siegelmann-Sontag]]), + a property not shared by the weaker network where the hidden-to-hidden + layer is replaced by an output-to-hidden layer (page 376). + + By the way, the RNN with a hidden-to-hidden layer has the same + architecture as the so-called linear dynamical system mentioned in + [[https://www.coursera.org/learn/neural-networks/lecture/Fpa7y/modeling-sequences-a-brief-overview][Hinton's + video]]. + + From what I have learned, the universality of RNN and feedforward + networks are therefore due to different arguments, the former coming + from Turing machines and the latter from an analytical view of + approximation by step functions. +- 2018-05-10 - *[[file:microposts/math-writing-decoupling.org][math-writing-decoupling]]* + + One way to write readable mathematics is to decouple concepts. One idea + is the following template. First write a toy example with all the + important components present in this example, then analyse each + component individually and elaborate how (perhaps more complex) + variations of the component can extend the toy example and induce more + complex or powerful versions of the toy example. Through such + incremental development, one should be able to arrive at any result in + cutting edge research after a pleasant journey. + + It's a bit like the UNIX philosophy, where you have a basic system of + modules like IO, memory management, graphics etc, and modify / improve + each module individually (H/t [[http://nand2tetris.org/][NAND2Tetris]]).
+ + The book [[http://neuralnetworksanddeeplearning.com/][Neural networks + and deep learning]] by Michael Nielsen is an example of such an approach. + It begins the journey with a very simple neural net with one hidden + layer, no regularisation, and sigmoid activations. It then analyses each + component including cost functions, the back propagation algorithm, the + activation functions, regularisation and the overall architecture (from + fully connected to CNN) individually and improves the toy example + incrementally. Over the course of the book the accuracy of the example on MNIST + grows incrementally from 95.42% to 99.67%. +- 2018-05-09 - *[[file:microposts/neural-nets-activation.org][neural-nets-activation]]* + + #+begin_quote + What makes the rectified linear activation function better than the + sigmoid or tanh functions? At present, we have a poor understanding of + the answer to this question. Indeed, rectified linear units have only + begun to be widely used in the past few years. The reason for that + recent adoption is empirical: a few people tried rectified linear + units, often on the basis of hunches or heuristic arguments. They got + good results classifying benchmark data sets, and the practice has + spread. In an ideal world we'd have a theory telling us which + activation function to pick for which application. But at present + we're a long way from such a world. I should not be at all surprised + if further major improvements can be obtained by an even better choice + of activation function. And I also expect that in coming decades a + powerful theory of activation functions will be developed. Today, we + still have to rely on poorly understood rules of thumb and experience.
+ #+end_quote + + Michael Nielsen, + [[http://neuralnetworksanddeeplearning.com/chap6.html#convolutional_neural_networks_in_practice][Neural + networks and deep learning]] +- 2018-05-09 - *[[file:microposts/neural-turing-machine.org][neural-turing-machine]]* + + #+begin_quote + One way RNNs are currently being used is to connect neural networks + more closely to traditional ways of thinking about algorithms, ways of + thinking based on concepts such as Turing machines and (conventional) + programming languages. [[https://arxiv.org/abs/1410.4615][A 2014 + paper]] developed an RNN which could take as input a + character-by-character description of a (very, very simple!) Python + program, and use that description to predict the output. Informally, + the network is learning to "understand" certain Python programs. + [[https://arxiv.org/abs/1410.5401][A second paper, also from 2014]], + used RNNs as a starting point to develop what they called a neural + Turing machine (NTM). This is a universal computer whose entire + structure can be trained using gradient descent. They trained their + NTM to infer algorithms for several simple problems, such as sorting + and copying. + + As it stands, these are extremely simple toy models. Learning to + execute the Python program =print(398345+42598)= doesn't make a + network into a full-fledged Python interpreter! It's not clear how + much further it will be possible to push the ideas. Still, the results + are intriguing. Historically, neural networks have done well at + pattern recognition problems where conventional algorithmic approaches + have trouble. Vice versa, conventional algorithmic approaches are good + at solving problems that neural nets aren't so good at. No-one today + implements a web server or a database program using a neural network! + It'd be great to develop unified models that integrate the strengths + of both neural networks and more traditional approaches to algorithms.
+ RNNs and ideas inspired by RNNs may help us do that. + #+end_quote + + Michael Nielsen, + [[http://neuralnetworksanddeeplearning.com/chap6.html#other_approaches_to_deep_neural_nets][Neural + networks and deep learning]] +- 2018-05-08 - *[[file:microposts/nlp-arxiv.org][nlp-arxiv]]* + + Primer Science is a tool by a startup called Primer that uses NLP to + summarize contents (but not single papers, yet) on arxiv. A developer of + this tool predicts in + [[https://twimlai.com/twiml-talk-136-taming-arxiv-w-natural-language-processing-with-john-bohannon/#][an + interview]] that progress on AI's ability to extract meanings from AI + research papers will be the biggest accelerant on AI research. +- 2018-05-08 - *[[file:microposts/neural-nets-regularization.org][neural-nets-regularization]]* + + #+begin_quote + no-one has yet developed an entirely convincing theoretical + explanation for why regularization helps networks generalize. Indeed, + researchers continue to write papers where they try different + approaches to regularization, compare them to see which works better, + and attempt to understand why different approaches work better or + worse. And so you can view regularization as something of a kludge. + While it often helps, we don't have an entirely satisfactory + systematic understanding of what's going on, merely incomplete + heuristics and rules of thumb. + + There's a deeper set of issues here, issues which go to the heart of + science. It's the question of how we generalize. Regularization may + give us a computational magic wand that helps our networks generalize + better, but it doesn't give us a principled understanding of how + generalization works, nor of what the best approach is. 
+ #+end_quote + + Michael Nielsen, + [[http://neuralnetworksanddeeplearning.com/chap3.html#why_does_regularization_help_reduce_overfitting][Neural + networks and deep learning]] +- 2018-05-08 - *[[file:microposts/sql-injection-video.org][sql-injection-video]]* + + Computerphile has some brilliant educational videos on computer science, + like [[https://www.youtube.com/watch?v=ciNHn38EyRc][a demo of SQL + injection]], [[https://www.youtube.com/watch?v=eis11j_iGMs][a toy + example of the lambda calculus]], and + [[https://www.youtube.com/watch?v=9T8A89jgeTI][explaining the Y + combinator]]. +- 2018-05-07 - *[[file:microposts/learning-knowledge-graph-reddit-journal-club.org][learning-knowledge-graph-reddit-journal-club]]* + + It is a natural idea to look for ways to learn things like going through + a skill tree in a computer RPG. + + For example I made a + [[https://ypei.me/posts/2015-04-02-juggling-skill-tree.html][DAG for + juggling]]. + + Websites like [[https://knowen.org][Knowen]] and + [[https://metacademy.org][Metacademy]] explore this idea with added + flavour of open collaboration. + + The design of Metacademy looks quite promising. It also has a nice + tagline: "your package manager for knowledge". + + There are so so many tools to assist learning / research / knowledge + sharing today, and we should keep experimenting, in the hope that + eventually one of them will scale. + + On another note, I often complain about the lack of a place to discuss + math research online, but today I found on Reddit some journal clubs on + machine learning: + [[https://www.reddit.com/r/MachineLearning/comments/8aluhs/d_machine_learning_wayr_what_are_you_reading_week/][1]], + [[https://www.reddit.com/r/MachineLearning/comments/8elmd8/d_anyone_having_trouble_reading_a_particular/][2]]. + If only we had this for maths. 
On the other hand r/math does have some + interesting recurring threads as well: + [[https://www.reddit.com/r/math/wiki/everythingaboutx][Everything about + X]] and + [[https://www.reddit.com/r/math/search?q=what+are+you+working+on?+author:automoderator+&sort=new&restrict_sr=on&t=all][What + Are You Working On?]]. Hopefully these threads can last for years to + come. +- 2018-05-02 - *[[file:microposts/simple-solution-lack-of-math-rendering.org][simple-solution-lack-of-math-rendering]]* + + The lack of maths rendering in major online communication platforms like + instant messaging, email or GitHub has been a minor obsession of mine + for quite a while, as I saw it as a big factor preventing people from + talking more maths online. But today I realised this is totally a + non-issue. Just do what people on IRC have been doing since the + inception of the universe: use a (LaTeX) pastebin. +- 2018-05-01 - *[[file:microposts/neural-networks-programming-paradigm.org][neural-networks-programming-paradigm]]* + + #+begin_quote + Neural networks are one of the most beautiful programming paradigms + ever invented. In the conventional approach to programming, we tell + the computer what to do, breaking big problems up into many small, + precisely defined tasks that the computer can easily perform. By + contrast, in a neural network we don't tell the computer how to solve + our problem. Instead, it learns from observational data, figuring out + its own solution to the problem at hand. + #+end_quote + + Michael Nielsen - + [[http://neuralnetworksanddeeplearning.com/about.html][What this book + (Neural Networks and Deep Learning) is about]] + + Unrelated to the quote, note that Nielsen's book is licensed under + [[https://creativecommons.org/licenses/by-nc/3.0/deed.en_GB][CC BY-NC]], + so one can build on it and redistribute non-commercially. 
+- 2018-04-30 - *[[file:microposts/google-search-not-ai.org][google-search-not-ai]]* + + #+begin_quote + But, users have learned to accommodate to Google not the other way + around. We know what kinds of things we can type into Google and what + we can't and we keep our searches to things that Google is likely to + help with. We know we are looking for texts and not answers to start a + conversation with an entity that knows what we really need to talk + about. People learn from conversation and Google can't have one. It + can pretend to have one using Siri but really those conversations tend + to get tiresome when you are past asking about where to eat. + #+end_quote + + Roger Schank - + [[http://www.rogerschank.com/fraudulent-claims-made-by-IBM-about-Watson-and-AI][Fraudulent + claims made by IBM about Watson and AI]] +- 2018-04-06 - *[[file:microposts/hacker-ethics.org][hacker-ethics]]* + + #+begin_quote + + + - Access to computers---and anything that might teach you something + about the way the world works---should be unlimited and total. + Always yield to the Hands-On Imperative! + - All information should be free. + - Mistrust Authority---Promote Decentralization. + - Hackers should be judged by their hacking, not bogus criteria such + as degrees, age, race, or position. + - You can create art and beauty on a computer. + - Computers can change your life for the better. + #+end_quote + + [[https://en.wikipedia.org/wiki/Hacker_ethic][The Hacker Ethic]], + [[https://en.wikipedia.org/wiki/Hackers:_Heroes_of_the_Computer_Revolution][Hackers: + Heroes of the Computer Revolution]], by Steven Levy +- 2018-03-23 - *[[file:microposts/static-site-generator.org][static-site-generator]]* + + #+begin_quote + "Static site generators seem like music databases, in that everyone + eventually writes their own crappy one that just barely scratches the + itch they had (and I'm no exception)." + #+end_quote + + __david__@hackernews + + So did I.
\ No newline at end of file |