#+TITLE: Yuchen's Microblog
- *[[ia-lawsuit][2020-08-02]]* - ia lawsuit
<<ia-lawsuit>>
The four big publishers Hachette, HarperCollins, Wiley, and Penguin
Random House are still pursuing the Internet Archive.
#+begin_quote
[Their] lawsuit does not stop at seeking to end the practice of
Controlled Digital Lending. These publishers call for the destruction
of the 1.5 million digital books that Internet Archive makes available
to our patrons. This form of digital book burning is unprecedented and
unfairly disadvantages people with print disabilities. For the blind,
ebooks are a lifeline, yet less than one in ten exists in accessible
formats. Since 2010, Internet Archive has made our lending library
available to the blind and print disabled community, in addition to
sighted users. If the publishers are successful with their lawsuit,
more than a million of those books would be deleted from the
Internet's digital shelves forever.
#+end_quote
[[https://blog.archive.org/2020/07/29/internet-archive-responds-to-publishers-lawsuit/][Libraries
lend books, and must continue to lend books: Internet Archive responds
to publishers' lawsuit]]
- *[[fsf-membership][2020-08-02]]* - fsf-membership
<<fsf-membership>>
I am a proud associate member of the Free Software Foundation. For me
the philosophy of Free Software is about ensuring the enrichment of a
digital commons, so that knowledge and information are not concentrated
in the hands of a privileged few and locked up as "intellectual
property". The genius of copyleft licenses like the GNU (A)GPL is that
they ensure software released to the public remains public. Open source
does not care about that.
If you also care about the public good, the hacker ethics, or the spirit
of the web, please take a moment to consider joining FSF as an associate
member. It comes with [[https://www.fsf.org/associate/benefits][numerous
perks and benefits]].
- *[[how-can-you-help-ia][2020-06-21]]* - how-can-you-help-ia
<<how-can-you-help-ia>>
[[https://blog.archive.org/2020/06/14/how-can-you-help-the-internet-archive/][How
can you help the Internet Archive?]] Use it. It's more than the Wayback
Machine. And get involved.
- *[[open-library][2020-06-12]]* - open-library
<<open-library>>
Open Library was cofounded by Aaron Swartz. As part of the Internet
Archive, it has done good work to spread knowledge. However, it is
currently
[[https://arstechnica.com/tech-policy/2020/06/internet-archive-ends-emergency-library-early-to-appease-publishers/][being
sued by four major publishers]] for the
[[https://archive.org/details/nationalemergencylibrary][National
Emergency Library]]. IA decided to
[[https://blog.archive.org/2020/06/10/temporary-national-emergency-library-to-close-2-weeks-early-returning-to-traditional-controlled-digital-lending/][close
the NEL two weeks earlier than planned]], but the lawsuit is not over.
In the worst-case scenario it could result in Controlled Digital
Lending being ruled illegal and (less likely) the bankruptcy of the
Internet Archive. Either outcome would be a big setback for the
free-culture movement.
- *[[sanders-suspend-campaign][2020-04-15]]* - sanders-suspend-campaign
<<sanders-suspend-campaign>>
Suspending the campaign is different from dropping out of the race.
Bernie Sanders remains on the ballot, and indeed in his campaign
suspension speech he encouraged people to continue voting for him in the
Democratic primaries to push for changes at the convention.
- *[[defense-stallman][2019-09-30]]* - defense-stallman
<<defense-stallman>>
Someone wrote a bold article titled
[[https://geoff.greer.fm/2019/09/30/in-defense-of-richard-stallman/]["In
Defense of Richard Stallman"]]. Kudos to him.
Also, an interesting read:
[[https://cfenollosa.com/blog/famous-computer-public-figure-suffers-the-consequences-for-asshole-ish-behavior.html][Famous
public figure in tech suffers the consequences for asshole-ish
behavior]].
- *[[stallman-resign][2019-09-29]]* - stallman-resign
<<stallman-resign>>
Last week Richard Stallman resigned from FSF. It is a great loss for the
free software movement.
The apparent cause of his resignation and the events that triggered it
reflect some alarming trends of the zeitgeist. Here is a detailed review
of what happened: [[https://sterling-archermedes.github.io/][Low grade
"journalists" and internet mob attack RMS with lies. In-depth review.]].
Some interesting articles on this are:
[[https://jackbaruth.com/?p=16779][Weekly Roundup: The Passion Of Saint
iGNUcius Edition]],
[[http://techrights.org/2019/09/17/rms-witch-hunt/][Why I Once Called
for Richard Stallman to Step Down]].
Dishonest and misleading media pieces involved in this incident include
[[https://www.thedailybeast.com/famed-mit-computer-scientist-richard-stallman-defends-epstein-victims-were-entirely-willing][The
Daily Beast]],
[[https://www.vice.com/en_us/article/9ke3ke/famed-computer-scientist-richard-stallman-described-epstein-victims-as-entirely-willing][Vice]],
[[https://techcrunch.com/2019/09/16/computer-scientist-richard-stallman-who-defended-jeffrey-epstein-resigns-from-mit-csail-and-the-free-software-foundation/][TechCrunch]],
[[https://www.wired.com/story/richard-stallmans-exit-heralds-a-new-era-in-tech/][Wired]].
- *[[decss-haiku][2019-03-16]]* - decss-haiku
<<decss-haiku>>
#+begin_quote
#+begin_example
Muse! When we learned to
count, little did we know all
the things we could do
some day by shuffling
those numbers: Pythagoras
said "All is number"
long before he saw
computers and their effects,
or what they could do
by computation,
naive and mechanical
fast arithmetic.
It changed the world, it
changed our consciousness and lives
to have such fast math
available to
us and anyone who cared
to learn programming.
Now help me, Muse, for
I wish to tell a piece of
controversial math,
for which the lawyers
of DVD CCA
don't forbear to sue:
that they alone should
know or have the right to teach
these skills and these rules.
(Do they understand
the content, or is it just
the effects they see?)
And all mathematics
is full of stories (just read
Eric Temple Bell);
and CSS is
no exception to this rule.
Sing, Muse, decryption
once secret, as all
knowledge, once unknown: how to
decrypt DVDs.
#+end_example
#+end_quote
Seth Schoen, [[https://en.wikipedia.org/wiki/DeCSS_haiku][DeCSS haiku]]
- *[[learning-undecidable][2019-01-27]]* - learning-undecidable
<<learning-undecidable>>
My take on the
[[https://www.nature.com/articles/s42256-018-0002-3][Nature paper
/Learning can be undecidable/]]:
Fantastic article, very clearly written.
It reduces a kind of learnability called estimating the maximum (EMX)
to the continuum hypothesis, a statement about the cardinality of
subsets of the real numbers that is undecidable (independent of the
standard axioms).
When it comes to the relation between EMX and the rest of the machine
learning framework, the article mentions that EMX belongs to "extensions
of PAC learnability include Vapnik's statistical learning setting and
the equivalent general learning setting by Shalev-Shwartz and
colleagues" (I have no idea what these two things are), but it does not
say whether EMX is representative of or reduces to common learning
tasks. So it is not clear whether its undecidability applies to ML at
large.
Another condition in the main theorem is the union-bounded closure
assumption. It seems a reasonable property of a family of sets, but
then again I wonder how it translates to learning.
The article says, "By now, we know of quite a few independence [from
mathematical axioms] results, mostly for set theoretic questions like
the continuum hypothesis, but also for results in algebra, analysis,
infinite combinatorics and more. Machine learning, so far, has escaped
this fate." But the description of EMX learnability makes it look more
like a classical mathematics / theoretical computer science problem
than a machine learning one.
An insightful conclusion: "How come learnability can neither be proved
nor refuted? A closer look reveals that the source of the problem is in
defining learnability as the existence of a learning function rather
than the existence of a learning algorithm. In contrast with the
existence of algorithms, the existence of functions over infinite
domains is a (logically) subtle issue."
In relation to practical problems, it uses ad targeting as an example.
However, a lot is lost in translation from the main theorem to this ad
example.
The EMX problem states: given a domain X, a distribution P over X which
is unknown, some samples from P, and a family of subsets of X called F,
find A in F that approximately maximises P(A).
The undecidability rests on X being the continuous [0, 1] interval, and
from the insight quoted above, we know the problem comes from the
cardinality of subsets of the [0, 1] interval, which is "logically
subtle".
In the ad problem, the domain X is all potential visitors, which is
finite because there is a finite number of people in the world. In this
case P is a categorical distribution over {1, ..., n}, where n is the
population of the world. One can get a good estimate of the parameters
of a categorical distribution by taking a sufficiently large number of
samples and computing the empirical distribution. Call the estimated
distribution Q. One can then choose from F (also finite) the set A that
maximises Q(A), which will be a solution to EMX (see the sketch below).
In other words, the theorem states: EMX is undecidable because not all
EMX instances are decidable, because there are some nasty ones due to
infinities. That does not mean no EMX instance is decidable. And I think
the ad instance is decidable. Is there a learning task that actually
corresponds to an undecidable EMX instance? I don't know, but I will not
believe the result of this paper is useful until I see one.
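Here is a minimal sketch of the finite procedure just described (my own
toy code, not from the paper; the name =emxFinite= and the
representation of F as lists of elements are made up for illustration):
#+begin_example
#include <cstddef>
#include <vector>

// samples: i.i.d. draws from the unknown distribution over {0, ..., n-1}
// family:  the finite family F, each subset given as a list of elements
// returns: the index in family of a set approximately maximising P(A)
std::size_t emxFinite(const std::vector<int>& samples,
                      const std::vector<std::vector<int>>& family, int n) {
    // empirical distribution Q
    std::vector<double> q(n, 0.0);
    for (int x : samples) q[x] += 1.0 / samples.size();
    // pick the set with the largest empirical mass: argmax_{A in F} Q(A)
    std::size_t best = 0;
    double bestMass = -1.0;
    for (std::size_t i = 0; i < family.size(); ++i) {
        double mass = 0.0;
        for (int x : family[i]) mass += q[x];
        if (mass > bestMass) { bestMass = mass; best = i; }
    }
    return best;
}
#+end_example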
h/t Reynaldo Boulogne
- *[[gavin-belson][2018-12-11]]* - gavin-belson
<<gavin-belson>>
#+begin_quote
I don't know about you people, but I don't want to live in a world
where someone else makes the world a better place better than we do.
#+end_quote
Gavin Belson, Silicon Valley S2E1.
I came across this quote in
[[https://slate.com/business/2018/12/facebook-emails-lawsuit-embarrassing-mark-zuckerberg.html][a
Slate post about Facebook]]
- *[[margins][2018-10-05]]* - margins
<<margins>>
With Fermat's Library's new tool
[[https://fermatslibrary.com/margins][margins]], you can host your own
journal club.
- *[[rnn-turing][2018-09-18]]* - rnn-turing
<<rnn-turing>>
Just some non-rigorous guess / thought: feedforward networks are like
combinational logic, and recurrent networks are like sequential logic
(e.g. the data flip-flop is like the feedback connection in an RNN).
Since NAND gates give you combinational logic, and combinational plus
sequential logic give you a von Neumann machine, which is an
approximation of the Turing machine, it is not surprising that RNNs
(with feedforward networks) are Turing complete (assuming that neural
networks can learn the NAND gate).
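To make the analogy concrete, here is a toy illustration (my own
made-up code): XOR built from NAND gates is combinational, while
cross-coupling two NAND gates into an SR latch adds feedback, i.e.
state, which is sequential logic:
#+begin_example
#include <iostream>

bool nand_(bool a, bool b) { return !(a && b); }

// Combinational: XOR from four NAND gates, no state.
bool xor_(bool a, bool b) {
    bool m = nand_(a, b);
    return nand_(nand_(a, m), nand_(b, m));
}

// Sequential: an SR latch with active-low inputs; q persists between
// calls thanks to the feedback between the two gates.
struct SRLatch {
    bool q = false, qbar = true;
    void step(bool s, bool r) {
        q    = nand_(s, qbar);
        qbar = nand_(r, q);
        q    = nand_(s, qbar);  // settle the feedback loop
    }
};

int main() {
    SRLatch latch;
    latch.step(false, true);            // set: q becomes 1
    std::cout << latch.q << '\n';       // 1
    latch.step(true, true);             // hold: q keeps its value
    std::cout << latch.q << '\n';       // 1
}
#+end_example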
- *[[zitierkartell][2018-09-07]]* - zitierkartell
<<zitierkartell>>
[[https://academia.stackexchange.com/questions/116489/counter-strategy-against-group-that-repeatedly-does-strategic-self-citations-and][Counter
strategy against group that repeatedly does strategic self-citations and
ignores other relevant research]]
- *[[short-science][2018-09-05]]* - short-science
<<short-science>>
#+begin_quote
- ShortScience.org is a platform for post-publication discussion
aiming to improve accessibility and reproducibility of research
ideas.
- The website has over 800 summaries, mostly in machine learning,
written by the community and organized by paper, conference, and
year.
- Reading summaries of papers is useful to obtain the perspective and
insight of another reader, why they liked or disliked it, and their
attempt to demystify complicated sections.
- Also, writing summaries is a good exercise to understand the content
of a paper because you are forced to challenge your assumptions when
explaining it.
- Finally, you can keep up to date with the flood of research by
reading the latest summaries on our Twitter and Facebook pages.
#+end_quote
[[https://shortscience.org][ShortScience.org]]
- *[[darknet-diaries][2018-08-13]]* - darknet-diaries
<<darknet-diaries>>
[[https://darknetdiaries.com][Darknet Diaries]] is a cool podcast.
According to its about page it covers "true stories from the dark side
of the Internet. Stories about hackers, defenders, threats, malware,
botnets, breaches, and privacy."
- *[[coursera-basic-income][2018-06-20]]* - coursera-basic-income
<<coursera-basic-income>>
Coursera is having
[[https://www.coursera.org/learn/exploring-basic-income-in-a-changing-economy][a
Teach-Out on Basic Income]].
- *[[pun-generator][2018-06-19]]* - pun-generator
<<pun-generator>>
[[https://en.wikipedia.org/wiki/Computational_humor#Pun_generation][Pun
generators exist]].
- *[[hackers-excerpt][2018-06-15]]* - hackers-excerpt
<<hackers-excerpt>>
#+begin_quote
But as more nontechnical people bought computers, the things that
impressed hackers were not as essential. While the programs themselves
had to maintain a certain standard of quality, it was quite possible
that the most exacting standards---those applied by a hacker who
wanted to add one more feature, or wouldn't let go of a project until
it was demonstrably faster than anything else around---were probably
counterproductive. What seemed more important was marketing. There
were plenty of brilliant programs which no one knew about. Sometimes
hackers would write programs and put them in the public domain, give
them away as easily as John Harris had lent his early copy of
Jawbreaker to the guys at the Fresno computer store. But rarely would
people ask for public domain programs by name: they wanted the ones
they saw advertised and discussed in magazines, demonstrated in
computer stores. It was not so important to have amazingly clever
algorithms. Users would put up with more commonplace ones.
The Hacker Ethic, of course, held that every program should be as good
as you could make it (or better), infinitely flexible, admired for its
brilliance of concept and execution, and designed to extend the user's
powers. Selling computer programs like toothpaste was heresy. But it
was happening. Consider the prescription for success offered by one of
a panel of high-tech venture capitalists, gathered at a 1982 software
show: "I can summarize what it takes in three words: marketing,
marketing, marketing." When computers are sold like toasters, programs
will be sold like toothpaste. The Hacker Ethic notwithstanding.
#+end_quote
[[http://www.stevenlevy.com/index.php/books/hackers][Hackers: Heroes of
Computer Revolution]], by Steven Levy.
- *[[catalan-overflow][2018-06-11]]* - catalan-overflow
<<catalan-overflow>>
To compute Catalan numbers without unnecessary overflow, use the
recurrence formula \(C_n = {4 n - 2 \over n + 1} C_{n - 1}\).
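A minimal sketch using this recurrence (my own; =catalan= is a made-up
name). Multiplying before dividing stays in integer arithmetic, and the
division is exact since \((n + 1) C_n = (4 n - 2) C_{n - 1}\):
#+begin_example
#include <cstdint>
#include <iostream>

uint64_t catalan(int n) {
    uint64_t c = 1;                     // C_0 = 1
    for (int k = 1; k <= n; ++k)
        c = c * (4 * k - 2) / (k + 1);  // exact division, no remainder
    return c;  // the intermediate product overflows 64 bits past n = 33
}

int main() {
    for (int n = 0; n <= 10; ++n)
        std::cout << catalan(n) << ' ';
    // 1 1 2 5 14 42 132 429 1430 4862 16796
}
#+end_example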
- *[[boyer-moore][2018-06-04]]* - boyer-moore
<<boyer-moore>>
The
[[https://en.wikipedia.org/wiki/Boyer–Moore_majority_vote_algorithm][Boyer-Moore
algorithm for finding the majority of a sequence of elements]] falls in
the category of "very clever algorithms".
#+begin_example
// Boyer-Moore majority vote: assumes xs is nonempty and a majority
// element (appearing more than half the time) exists. Invariant: the
// majority of the whole sequence survives the pairwise cancellation
// of distinct elements.
int majorityElement(vector<int>& xs) {
    int count = 0;
    int maj = xs[0];
    for (auto x : xs) {
        if (x == maj) count++;         // current candidate reinforced
        else if (count == 0) maj = x;  // candidate exhausted, adopt x
        else count--;                  // x cancels one vote for maj
    }
    return maj;
}
#+end_example
- *[[how-to-learn-on-your-own][2018-05-30]]* - how-to-learn-on-your-own
<<how-to-learn-on-your-own>>
Roger Grosse's post
[[https://metacademy.org/roadmaps/rgrosse/learn_on_your_own][How to
learn on your own (2015)]] is an excellent modern guide on how to learn
and research technical stuff (especially machine learning and maths) on
one's own.
- *[[2048-mdp][2018-05-25]]* - 2048-mdp
<<2048-mdp>>
[[http://jdlm.info/articles/2018/03/18/markov-decision-process-2048.html][This
post]] models 2048 as an MDP and solves it using policy iteration and
backward induction.
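Generically, backward induction computes
\(V_t(s) = \max_a \sum_{s'} P(s' \mid s, a) (R(s, a, s') + V_{t+1}(s'))\)
from the horizon backwards. A minimal sketch (my own toy code, not from
the post; =backwardInduction= and the dense P and R arrays are made up
for illustration):
#+begin_example
#include <algorithm>
#include <vector>

// P[s][a][s2]: probability of moving to s2 from s under action a
// R[s][a][s2]: the associated reward
// Returns the optimal values at time 0 for a finite-horizon MDP.
std::vector<double> backwardInduction(
    const std::vector<std::vector<std::vector<double>>>& P,
    const std::vector<std::vector<std::vector<double>>>& R,
    int horizon) {
    int S = (int)P.size(), A = (int)P[0].size();
    std::vector<double> V(S, 0.0);                // V_T = 0
    for (int t = horizon - 1; t >= 0; --t) {
        std::vector<double> Vnew(S);
        for (int s = 0; s < S; ++s) {
            double best = -1e300;
            for (int a = 0; a < A; ++a) {         // best action at (s, t)
                double q = 0.0;
                for (int s2 = 0; s2 < S; ++s2)
                    q += P[s][a][s2] * (R[s][a][s2] + V[s2]);
                best = std::max(best, q);
            }
            Vnew[s] = best;
        }
        V = Vnew;
    }
    return V;
}
#+end_example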
- *[[ats][2018-05-22]]* - ats
<<ats>>
#+begin_quote
ATS (Applied Type System) is a programming language designed to unify
programming with formal specification. ATS has support for combining
theorem proving with practical programming through the use of advanced
type systems. A past version of The Computer Language Benchmarks Game
has demonstrated that the performance of ATS is comparable to that of
the C and C++ programming languages. By using theorem proving and
strict type checking, the compiler can detect and prove that its
implemented functions are not susceptible to bugs such as division by
zero, memory leaks, buffer overflow, and other forms of memory
corruption by verifying pointer arithmetic and reference counting
before the program compiles. Additionally, by using the integrated
theorem-proving system of ATS (ATS/LF), the programmer may make use of
static constructs that are intertwined with the operative code to
prove that a function attains its specification.
#+end_quote
[[https://en.wikipedia.org/wiki/ATS_(programming_language)][Wikipedia
entry on ATS]]
- *[[bostoncalling][2018-05-20]]* - bostoncalling
<<bostoncalling>>
(5-second fame) I sent a picture of my kitchen sink to BBC and got
mentioned in the [[https://www.bbc.co.uk/programmes/w3cswg8c][latest
Boston Calling episode]] (listen at 25:54).
- *[[colah-blog][2018-05-18]]* - colah-blog
<<colah-blog>>
[[https://colah.github.io/][colah's blog]] has a cool feature that
allows you to comment on any paragraph of a blog post. Here's an
[[https://colah.github.io/posts/2015-08-Understanding-LSTMs/][example]].
If it is doable on a static site hosted on GitHub Pages, I suppose it
shouldn't be too hard to implement. This also seems to work more
seamlessly than [[https://fermatslibrary.com/][Fermat's Library]],
because the latter has to embed PDFs in webpages. Now fantasy time:
imagine that one day arXiv shows HTML versions of papers (through author
uploading or conversion from TeX) with this feature.
- *[[random-forests][2018-05-15]]* - random-forests
<<random-forests>>
[[https://lagunita.stanford.edu/courses/HumanitiesSciences/StatLearning/Winter2016/info][Stanford
Lagunita's statistical learning course]] has some excellent lectures on
random forests. It starts with explanations of decision trees, followed
by bagged trees and random forests, and ends with boosting. From these
lectures it seems that:
1. The term "predictors" in statistical learning = "features" in machine
learning.
1. The main idea of random forests, dropping predictors for individual
trees and aggregating by majority vote or averaging, is the same as the
idea of dropout in neural networks, where a proportion of neurons in the
hidden layers are dropped temporarily during different minibatches of
training, effectively averaging over an ensemble of subnetworks. Both
tricks are used as regularisation, i.e. to reduce the variance. The only
difference: in random forests, all but roughly the square root of the
total number of predictors are dropped (see the sketch below), whereas
the dropout ratio in neural networks is usually a half.
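A minimal sketch of the feature-subsampling step (my own toy code, not
from the course; =sampleFeatures= is a made-up name, and it subsamples
per tree rather than per split for simplicity):
#+begin_example
#include <algorithm>
#include <cmath>
#include <numeric>
#include <random>
#include <vector>

// Pick roughly sqrt(p) of the p predictors for one tree, uniformly at
// random; the analogue of dropout keeping about half of the neurons.
std::vector<int> sampleFeatures(int p, std::mt19937& rng) {
    std::vector<int> idx(p);
    std::iota(idx.begin(), idx.end(), 0);  // 0, 1, ..., p-1
    std::shuffle(idx.begin(), idx.end(), rng);
    int m = std::max(1, (int)std::lround(std::sqrt((double)p)));
    idx.resize(m);                         // keep only sqrt(p) predictors
    return idx;
}
#+end_example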
By the way, here's a comparison between statistical learning and machine
learning from the slides of the Statistical Learning course:
- *[[open-review-net][2018-05-14]]* - open-review-net
<<open-review-net>>
Open peer review means a peer-review process where communications,
e.g. comments and responses, are public.
Like [[https://scipost.org/][SciPost]] mentioned in
[[file:/posts/2018-04-10-update-open-research.html][my post]],
[[https://openreview.net][OpenReview.net]] is an example of open peer
review in research. It looks like their focus is machine learning. Their
[[https://openreview.net/about][about page]] states their mission, and
here's [[https://openreview.net/group?id=ICLR.cc/2018/Conference][an
example]] where you can click on each entry to see what it is like. We
definitely need this in the maths research community.
- *[[rnn-fsm][2018-05-11]]* - rnn-fsm
<<rnn-fsm>>
Related to [[neural-turing-machine][a previous micropost]].
[[http://www.cs.toronto.edu/~rgrosse/csc321/lec9.pdf][These slides from
Toronto]] are a nice introduction to RNNs (recurrent neural networks)
from a computational point of view. They state that an RNN can simulate
any FSM (finite state machine, a.k.a. finite automaton) with a toy
example computing the parity of a binary string (sketched at the end of
this post).
[[http://www.deeplearningbook.org/contents/rnn.html][Goodfellow et
al.'s book]] (see pages 372 and 374) goes one step further, stating that
an RNN with a hidden-to-hidden layer can simulate Turing machines, and
not only that, but also the /universal/ Turing machine (UTM; the book
references
[[https://www.sciencedirect.com/science/article/pii/S0022000085710136][Siegelmann-Sontag]]),
a property not shared by the weaker network where the hidden-to-hidden
layer is replaced by an output-to-hidden layer (page 376).
By the way, the RNN with a hidden-to-hidden layer has the same
architecture as the so-called linear dynamical system mentioned in
[[https://www.coursera.org/learn/neural-networks/lecture/Fpa7y/modeling-sequences-a-brief-overview][Hinton's
video]].
From what I have learned, the universality of RNNs and that of
feedforward networks are therefore due to different arguments, the
former coming from Turing machines and the latter from an analytical
view of approximation by step functions.
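A minimal sketch of the parity example (my own toy, with hand-picked
weights rather than learned ones): one RNN step computes the XOR of the
state and the input using three threshold units, which is exactly the
two-state parity FSM:
#+begin_example
#include <iostream>
#include <string>

int heaviside(double z) { return z > 0 ? 1 : 0; }

// one RNN step: two hidden threshold units feeding one state unit
int parityStep(int h, int x) {
    int u = heaviside(h + x - 0.5); // fires iff at least one of h, x is 1
    int v = heaviside(h + x - 1.5); // fires iff both h and x are 1
    return heaviside(u - v - 0.5);  // u AND NOT v, i.e. h XOR x
}

int parity(const std::string& bits) {
    int h = 0;                      // initial state: even parity
    for (char c : bits) h = parityStep(h, c - '0');
    return h;
}

int main() {
    std::cout << parity("1011") << '\n'; // 1: three ones, odd parity
}
#+end_example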
- *[[math-writing-decoupling][2018-05-10]]* - math-writing-decoupling
<<math-writing-decoupling>>
One way to write readable mathematics is to decouple concepts. One idea
is the following template. First write a toy example with all the
important components present in this example, then analyse each
component individually and elaborate how (perhaps more complex)
variations of the component can extend the toy example and induce more
complex or powerful versions of the toy example. Through such
incremental development, one should be able to arrive at any result in
cutting edge research after a pleasant journey.
It's a bit like the UNIX philosophy, where you have a basic system of
modules like IO, memory management, graphics etc, and modify / improve
each module individually (H/t [[http://nand2tetris.org/][NAND2Tetris]]).
The book [[http://neuralnetworksanddeeplearning.com/][Neural networks
and deep learning]] by Michael Nielsen is an example of such an
approach. It begins the journey with a very simple neural net with one
hidden layer, no regularisation, and sigmoid activations. It then
analyses each component, including the cost function, the
backpropagation algorithm, the activation functions, regularisation and
the overall architecture (from fully connected to CNN), individually,
improving the toy example incrementally. Over the course of the book
the accuracy on MNIST grows from 95.42% to 99.67%.
- *[[neural-nets-activation][2018-05-09]]* - neural-nets-activation
<<neural-nets-activation>>
#+begin_quote
What makes the rectified linear activation function better than the
sigmoid or tanh functions? At present, we have a poor understanding of
the answer to this question. Indeed, rectified linear units have only
begun to be widely used in the past few years. The reason for that
recent adoption is empirical: a few people tried rectified linear
units, often on the basis of hunches or heuristic arguments. They got
good results classifying benchmark data sets, and the practice has
spread. In an ideal world we'd have a theory telling us which
activation function to pick for which application. But at present
we're a long way from such a world. I should not be at all surprised
if further major improvements can be obtained by an even better choice
of activation function. And I also expect that in coming decades a
powerful theory of activation functions will be developed. Today, we
still have to rely on poorly understood rules of thumb and experience.
#+end_quote
Michael Nielsen,
[[http://neuralnetworksanddeeplearning.com/chap6.html#convolutional_neural_networks_in_practice][Neural
networks and deep learning]]
- *[[neural-turing-machine][2018-05-09]]* - neural-turing-machine
<<neural-turing-machine>>
#+begin_quote
One way RNNs are currently being used is to connect neural networks
more closely to traditional ways of thinking about algorithms, ways of
thinking based on concepts such as Turing machines and (conventional)
programming languages. [[https://arxiv.org/abs/1410.4615][A 2014
paper]] developed an RNN which could take as input a
character-by-character description of a (very, very simple!) Python
program, and use that description to predict the output. Informally,
the network is learning to "understand" certain Python programs.
[[https://arxiv.org/abs/1410.5401][A second paper, also from 2014]],
used RNNs as a starting point to develop what they called a neural
Turing machine (NTM). This is a universal computer whose entire
structure can be trained using gradient descent. They trained their
NTM to infer algorithms for several simple problems, such as sorting
and copying.
As it stands, these are extremely simple toy models. Learning to
execute the Python program =print(398345+42598)= doesn't make a
network into a full-fledged Python interpreter! It's not clear how
much further it will be possible to push the ideas. Still, the results
are intriguing. Historically, neural networks have done well at
pattern recognition problems where conventional algorithmic approaches
have trouble. Vice versa, conventional algorithmic approaches are good
at solving problems that neural nets aren't so good at. No-one today
implements a web server or a database program using a neural network!
It'd be great to develop unified models that integrate the strengths
of both neural networks and more traditional approaches to algorithms.
RNNs and ideas inspired by RNNs may help us do that.
#+end_quote
Michael Nielsen,
[[http://neuralnetworksanddeeplearning.com/chap6.html#other_approaches_to_deep_neural_nets][Neural
networks and deep learning]]
- *[[nlp-arxiv][2018-05-08]]* - nlp-arxiv
<<nlp-arxiv>>
Primer Science is a tool by a startup called Primer that uses NLP to
summarize content on arXiv (though not single papers, yet). A developer
of this tool predicts in
[[https://twimlai.com/twiml-talk-136-taming-arxiv-w-natural-language-processing-with-john-bohannon/#][an
interview]] that progress on AI's ability to extract meaning from AI
research papers will be the biggest accelerant of AI research.
- *[[neural-nets-regularization][2018-05-08]]* - neural-nets-regularization
<<neural-nets-regularization>>
#+begin_quote
no-one has yet developed an entirely convincing theoretical
explanation for why regularization helps networks generalize. Indeed,
researchers continue to write papers where they try different
approaches to regularization, compare them to see which works better,
and attempt to understand why different approaches work better or
worse. And so you can view regularization as something of a kludge.
While it often helps, we don't have an entirely satisfactory
systematic understanding of what's going on, merely incomplete
heuristics and rules of thumb.
There's a deeper set of issues here, issues which go to the heart of
science. It's the question of how we generalize. Regularization may
give us a computational magic wand that helps our networks generalize
better, but it doesn't give us a principled understanding of how
generalization works, nor of what the best approach is.
#+end_quote
Michael Nielsen,
[[http://neuralnetworksanddeeplearning.com/chap3.html#why_does_regularization_help_reduce_overfitting][Neural
networks and deep learning]]
- *[[sql-injection-video][2018-05-08]]* - sql-injection-video
<<sql-injection-video>>
Computerphile has some brilliant educational videos on computer science,
like [[https://www.youtube.com/watch?v=ciNHn38EyRc][a demo of SQL
injection]], [[https://www.youtube.com/watch?v=eis11j_iGMs][a toy
example of the lambda calculus]], and
[[https://www.youtube.com/watch?v=9T8A89jgeTI][explaining the Y
combinator]].
- *[[learning-knowledge-graph-reddit-journal-club][2018-05-07]]* - learning-knowledge-graph-reddit-journal-club
<<learning-knowledge-graph-reddit-journal-club>>
It is a natural idea to look for ways to learn things as if going
through a skill tree in a computer RPG.
For example I made a
[[https://ypei.me/posts/2015-04-02-juggling-skill-tree.html][DAG for
juggling]].
Websites like [[https://knowen.org][Knowen]] and
[[https://metacademy.org][Metacademy]] explore this idea with an added
flavour of open collaboration.
The design of Metacademy looks quite promising. It also has a nice
tagline: "your package manager for knowledge".
There are so, so many tools to assist learning / research / knowledge
sharing today, and we should keep experimenting, in the hope that
eventually one of them will scale.
On another note, I often complain about the lack of a place to discuss
math research online, but today I found on Reddit some journal clubs on
machine learning:
[[https://www.reddit.com/r/MachineLearning/comments/8aluhs/d_machine_learning_wayr_what_are_you_reading_week/][1]],
[[https://www.reddit.com/r/MachineLearning/comments/8elmd8/d_anyone_having_trouble_reading_a_particular/][2]].
If only we had this for maths. On the other hand r/math does have some
interesting recurring threads as well:
[[https://www.reddit.com/r/math/wiki/everythingaboutx][Everything about
X]] and
[[https://www.reddit.com/r/math/search?q=what+are+you+working+on?+author:automoderator+&sort=new&restrict_sr=on&t=all][What
Are You Working On?]]. Hopefully these threads can last for years to
come.
- *[[simple-solution-lack-of-math-rendering][2018-05-02]]* - simple-solution-lack-of-math-rendering
<<simple-solution-lack-of-math-rendering>>
The lack of maths rendering in major online communication platforms like
instant messaging, email or Github has been a minor obsession of mine
for quite a while, as I saw it as a big factor preventing people from
talking more maths online. But today I realised this is totally a
non-issue. Just do what people on IRC have been doing since the
inception of the universe: use a (LaTeX) pastebin.
- *[[neural-networks-programming-paradigm][2018-05-01]]* - neural-networks-programming-paradigm
<<neural-networks-programming-paradigm>>
#+begin_quote
Neural networks are one of the most beautiful programming paradigms
ever invented. In the conventional approach to programming, we tell
the computer what to do, breaking big problems up into many small,
precisely defined tasks that the computer can easily perform. By
contrast, in a neural network we don't tell the computer how to solve
our problem. Instead, it learns from observational data, figuring out
its own solution to the problem at hand.
#+end_quote
Michael Nielsen -
[[http://neuralnetworksanddeeplearning.com/about.html][What this book
(Neural Networks and Deep Learning) is about]]
Unrelated to the quote, note that Nielsen's book is licensed under
[[https://creativecommons.org/licenses/by-nc/3.0/deed.en_GB][CC BY-NC]],
so one can build on it and redistribute non-commercially.
- *[[google-search-not-ai][2018-04-30]]* - google-search-not-ai
<<google-search-not-ai>>
#+begin_quote
But, users have learned to accommodate to Google not the other way
around. We know what kinds of things we can type into Google and what
we can't and we keep our searches to things that Google is likely to
help with. We know we are looking for texts and not answers to start a
conversation with an entity that knows what we really need to talk
about. People learn from conversation and Google can't have one. It
can pretend to have one using Siri but really those conversations tend
to get tiresome when you are past asking about where to eat.
#+end_quote
Roger Schank -
[[http://www.rogerschank.com/fraudulent-claims-made-by-IBM-about-Watson-and-AI][Fraudulent
claims made by IBM about Watson and AI]]
- *[[hacker-ethics][2018-04-06]]* - hacker-ethics
<<hacker-ethics>>
#+begin_quote
- Access to computers---and anything that might teach you something
about the way the world works---should be unlimited and total.
Always yield to the Hands-On Imperative!
- All information should be free.
- Mistrust Authority---Promote Decentralization.
- Hackers should be judged by their hacking, not bogus criteria such
as degrees, age, race, or position.
- You can create art and beauty on a computer.
- Computers can change your life for the better.
#+end_quote
[[https://en.wikipedia.org/wiki/Hacker_ethic][The Hacker Ethic]],
[[https://en.wikipedia.org/wiki/Hackers:_Heroes_of_the_Computer_Revolution][Hackers:
Heroes of Computer Revolution]], by Steven Levy
- *[[static-site-generator][2018-03-23]]* - static-site-generator
<<static-site-generator>>
#+begin_quote
"Static site generators seem like music databases, in that everyone
eventually writes their own crappy one that just barely scratches the
itch they had (and I'm no exception)."
#+end_quote
__david__@hackernews
So did I.