#+title: Yuchen's Microblog
*** 2020-08-02: ia-lawsuit
:PROPERTIES:
:CUSTOM_ID: ia-lawsuit
:END:
The four big publishers Hachette, HarperCollins, Wiley, and Penguin
Random House are still pursuing their lawsuit against the Internet Archive.
#+begin_quote
[Their] lawsuit does not stop at seeking to end the practice of
Controlled Digital Lending. These publishers call for the destruction
of the 1.5 million digital books that Internet Archive makes available
to our patrons. This form of digital book burning is unprecedented and
unfairly disadvantages people with print disabilities. For the blind,
ebooks are a lifeline, yet less than one in ten exists in accessible
formats. Since 2010, Internet Archive has made our lending library
available to the blind and print disabled community, in addition to
sighted users. If the publishers are successful with their lawsuit,
more than a million of those books would be deleted from the
Internet's digital shelves forever.
#+end_quote
[[https://blog.archive.org/2020/07/29/internet-archive-responds-to-publishers-lawsuit/][Libraries
lend books, and must continue to lend books: Internet Archive responds
to publishers' lawsuit]]
*** 2020-08-02: fsf-membership
:PROPERTIES:
:CUSTOM_ID: fsf-membership
:END:
I am a proud associate member of the Free Software Foundation. For me the
philosophy of Free Software is about ensuring the enrichment of a
digital commons, so that knowledge and information are not concentrated
in the hands of a select privileged few and locked up as
"intellectual property". The genius of copyleft licenses like the GNU
(A)GPL is that they ensure software released to the public remains
public. Open source does not care about that.
If you also care about the public good, the hacker ethic, or the spirit
of the web, please take a moment to consider joining the FSF as an
associate member. It comes with
[[https://www.fsf.org/associate/benefits][numerous perks and benefits]].
*** 2020-06-21: how-can-you-help-ia
:PROPERTIES:
:CUSTOM_ID: how-can-you-help-ia
:END:
[[https://blog.archive.org/2020/06/14/how-can-you-help-the-internet-archive/][How
can you help the Internet Archive?]] Use it. It's more than the Wayback
Machine. And get involved.
*** 2020-06-12: open-library
:PROPERTIES:
:CUSTOM_ID: open-library
:END:
Open Library was cofounded by Aaron Swartz. As part of the Internet
Archive, it has done good work to spread knowledge. However it is
currently
[[https://arstechnica.com/tech-policy/2020/06/internet-archive-ends-emergency-library-early-to-appease-publishers/][being
sued by four major publishers]] for the
[[https://archive.org/details/nationalemergencylibrary][National
Emergency Library]]. IA decided to
[[https://blog.archive.org/2020/06/10/temporary-national-emergency-library-to-close-2-weeks-early-returning-to-traditional-controlled-digital-lending/][close
the NEL two weeks earlier than planned]], but the lawsuit is not over.
In the worst-case scenario it could result in Controlled Digital Lending
being ruled illegal and (less likely) in the bankruptcy of the Internet
Archive. If that happens it will be a big setback for the free-culture
movement.
*** 2020-04-15: sanders-suspend-campaign
:PROPERTIES:
:CUSTOM_ID: sanders-suspend-campaign
:END:
Suspending the campaign is different from dropping out of the race.
Bernie Sanders remains on the ballot, and indeed in his campaign
suspension speech he encouraged people to continue voting for him in the
Democratic primaries to push for changes at the convention.
*** 2019-09-30: defense-stallman
:PROPERTIES:
:CUSTOM_ID: defense-stallman
:END:
Someone wrote a bold article titled
[[https://geoff.greer.fm/2019/09/30/in-defense-of-richard-stallman/]["In
Defense of Richard Stallman"]]. Kudos to him.
Also, an interesting read:
[[https://cfenollosa.com/blog/famous-computer-public-figure-suffers-the-consequences-for-asshole-ish-behavior.html][Famous
public figure in tech suffers the consequences for asshole-ish
behavior]].
*** 2019-09-29: stallman-resign
:PROPERTIES:
:CUSTOM_ID: stallman-resign
:END:
Last week Richard Stallman resigned from the FSF. It is a great loss for the
free software movement.
The apparent cause of his resignation and the events that triggered it
reflect some alarming trends of the zeitgeist. Here is a detailed review
of what happened: [[https://sterling-archermedes.github.io/][Low grade
"journalists" and internet mob attack RMS with lies. In-depth review.]].
Some interesting articles on this are:
[[https://jackbaruth.com/?p=16779][Weekly Roundup: The Passion Of Saint
iGNUcius Edition]],
[[http://techrights.org/2019/09/17/rms-witch-hunt/][Why I Once Called
for Richard Stallman to Step Down]].
Dishonest and misleading media pieces involved in this incident include
[[https://www.thedailybeast.com/famed-mit-computer-scientist-richard-stallman-defends-epstein-victims-were-entirely-willing][The
Daily Beast]],
[[https://www.vice.com/en_us/article/9ke3ke/famed-computer-scientist-richard-stallman-described-epstein-victims-as-entirely-willing][Vice]],
[[https://techcrunch.com/2019/09/16/computer-scientist-richard-stallman-who-defended-jeffrey-epstein-resigns-from-mit-csail-and-the-free-software-foundation/][Tech
Crunch]],
[[https://www.wired.com/story/richard-stallmans-exit-heralds-a-new-era-in-tech/][Wired]].
*** 2019-03-16: decss-haiku
:PROPERTIES:
:CUSTOM_ID: decss-haiku
:END:
#+begin_quote
#+begin_example
Muse! When we learned to
count, little did we know all
the things we could do
some day by shuffling
those numbers: Pythagoras
said "All is number"
long before he saw
computers and their effects,
or what they could do
by computation,
naive and mechanical
fast arithmetic.
It changed the world, it
changed our consciousness and lives
to have such fast math
available to
us and anyone who cared
to learn programming.
Now help me, Muse, for
I wish to tell a piece of
controversial math,
for which the lawyers
of DVD CCA
don't forbear to sue:
that they alone should
know or have the right to teach
these skills and these rules.
(Do they understand
the content, or is it just
the effects they see?)
And all mathematics
is full of stories (just read
Eric Temple Bell);
and CSS is
no exception to this rule.
Sing, Muse, decryption
once secret, as all
knowledge, once unknown: how to
decrypt DVDs.
#+end_example
#+end_quote
Seth Schoen, [[https://en.wikipedia.org/wiki/DeCSS_haiku][DeCSS haiku]]
*** 2019-01-27: learning-undecidable
:PROPERTIES:
:CUSTOM_ID: learning-undecidable
:END:
My take on the
[[https://www.nature.com/articles/s42256-018-0002-3][Nature paper
/Learning can be undecidable/]]:
Fantastic article, very clearly written.
It reduces a kind of learnability called estimating the maximum (EMX)
to a question about the cardinality of sets of real numbers, which is
undecidable (independent of the standard axioms of set theory).
When it comes to the relation between EMX and the rest of the machine
learning framework, the article mentions that EMX belongs to "extensions
of PAC learnability include Vapnik's statistical learning setting and
the equivalent general learning setting by Shalev-Shwartz and
colleagues" (I have no idea what these two things are), but it does not
say whether EMX is representative of or reduces to common learning
tasks. So it is not clear whether its undecidability applies to ML at
large.
Another condition of the main theorem is the union bounded closure
assumption. It seems a reasonable property of a family of sets, but then
again I wonder how it translates to learning.
The article says "By now, we know of quite a few independence [from
mathematical axioms] results, mostly for set theoretic questions like
the continuum hypothesis, but also for results in algebra, analysis,
infinite combinatorics and more. Machine learning, so far, has escaped
this fate." but the description of the EMX learnability makes it more
like a classical mathematical / theoretical computer science problem
rather than machine learning.
An insightful conclusion: "How come learnability can neither be proved
nor refuted? A closer look reveals that the source of the problem is in
defining learnability as the existence of a learning function rather
than the existence of a learning algorithm. In contrast with the
existence of algorithms, the existence of functions over infinite
domains is a (logically) subtle issue."
In relation to practical problems, it uses an example of ad targeting.
However, a lot is lost in translation from the main theorem to this ad
example.
The EMX problem states: given a domain X, a distribution P over X which
is unknown, some samples from P, and a family of subsets of X called F,
find A in F that approximately maximises P(A).
The undecidability rests on X being the continuous [0, 1] interval, and
from the insight, we know the problem comes from the cardinality of
subsets of the [0, 1] interval, which is "logically subtle".
In the ad problem, the domain X is all potential visitors, which is
finite because there is a finite number of people in the world. In this
case P is a categorical distribution over 1..n, where n is the
population of the world. One can get a good estimate of the parameters
of a categorical distribution by drawing a sufficiently large number of
samples and computing the empirical distribution. Let's call the
estimated distribution Q. One can then choose from F (also finite) the
set A that maximises Q(A), which will be a solution to EMX.
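To make this concrete, here is a minimal Python sketch (my own
illustration with made-up numbers, not from the paper) of the finite
ad-targeting instance described above: estimate the distribution
empirically from samples, then pick the set in F with the largest
estimated mass.
#+begin_example
import random

n = 5                                # toy population size
true_p = [0.1, 0.4, 0.2, 0.2, 0.1]   # the unknown distribution P
F = [{0, 1}, {1, 2}, {3, 4}]         # finite family of candidate sets

# Draw samples from P and compute the empirical distribution Q.
samples = random.choices(range(n), weights=true_p, k=10000)
q = [samples.count(i) / len(samples) for i in range(n)]

# Choose the set in F with the largest estimated mass Q(A).
best = max(F, key=lambda a: sum(q[i] for i in a))
print(best)  # with these numbers, almost surely {1, 2}
#+end_example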
In other words, the theorem states: EMX is undecidable because not all
EMX instances are decidable, because there are some nasty ones due to
infinities. That does not mean no EMX instance is decidable. And I think
the ad instance is decidable. Is there a learning task that actually
corresponds to an undecidable EMX instance? I don't know, but I will not
believe the result of this paper is useful until I see one.
h/t Reynaldo Boulogne
*** 2018-12-11: gavin-belson
:PROPERTIES:
:CUSTOM_ID: gavin-belson
:END:
#+begin_quote
I don't know about you people, but I don't want to live in a world
where someone else makes the world a better place better than we do.
#+end_quote
Gavin Belson, Silicon Valley S2E1.
I came across this quote in
[[https://slate.com/business/2018/12/facebook-emails-lawsuit-embarrassing-mark-zuckerberg.html][a
Slate post about Facebook]]
*** 2018-10-05: margins
:PROPERTIES:
:CUSTOM_ID: margins
:END:
With Fermat's Library's new tool
[[https://fermatslibrary.com/margins][margins]], you can host your own
journal club.
*** 2018-09-18: rnn-turing
:PROPERTIES:
:CUSTOM_ID: rnn-turing
:END:
Just some non-rigorous guess / thought: feedforward networks are like
combinatorial logic, and recurrent networks are like sequential logic
(e.g. the data flip-flop is like the feedback connection in an RNN).
Since NAND + combinatorial logic + sequential logic = a von Neumann
machine, which is an approximation of the Turing machine, it is not
surprising that RNNs (with feedforward networks) are Turing complete
(assuming that neural networks can learn the NAND gate).
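A toy Python sketch of the analogy (my own illustration, not from any
reference): a single threshold unit can implement NAND, and
cross-coupling two of them into a feedback loop stores a bit, which is
the flip-flop-like behaviour that recurrence adds on top of feedforward
(combinatorial) logic.
#+begin_example
def nand(a, b):
    # one threshold unit: weights (-2, -2), bias 3, step activation
    return 1 if (-2 * a - 2 * b + 3) > 0 else 0

def sr_latch(s, r, q):
    # cross-coupled NAND units (active-low set/reset), iterated to a fixed point
    for _ in range(2):
        q_bar = nand(r, q)
        q = nand(s, q_bar)
    return q

q = 0
q = sr_latch(0, 1, q)  # set: q becomes 1
q = sr_latch(1, 1, q)  # hold: q stays 1, i.e. the loop remembers the bit
print(q)               # 1
#+end_example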
*** 2018-09-07: zitierkartell
:PROPERTIES:
:CUSTOM_ID: zitierkartell
:END:
[[https://academia.stackexchange.com/questions/116489/counter-strategy-against-group-that-repeatedly-does-strategic-self-citations-and][Counter
strategy against group that repeatedly does strategic self-citations and
ignores other relevant research]]
*** 2018-09-05: short-science
:PROPERTIES:
:CUSTOM_ID: short-science
:END:
#+begin_quote
- ShortScience.org is a platform for post-publication discussion
aiming to improve accessibility and reproducibility of research
ideas.
- The website has over 800 summaries, mostly in machine learning,
written by the community and organized by paper, conference, and
year.
- Reading summaries of papers is useful to obtain the perspective and
insight of another reader, why they liked or disliked it, and their
attempt to demystify complicated sections.
- Also, writing summaries is a good exercise to understand the content
of a paper because you are forced to challenge your assumptions when
explaining it.
- Finally, you can keep up to date with the flood of research by
reading the latest summaries on our Twitter and Facebook pages.
#+end_quote
[[https://shortscience.org][ShortScience.org]]
*** 2018-08-13: darknet-diaries
:PROPERTIES:
:CUSTOM_ID: darknet-diaries
:END:
[[https://darknetdiaries.com][Darknet Diaries]] is a cool podcast.
According to its about page it covers "true stories from the dark side
of the Internet. Stories about hackers, defenders, threats, malware,
botnets, breaches, and privacy."
*** 2018-06-20: coursera-basic-income
:PROPERTIES:
:CUSTOM_ID: coursera-basic-income
:END:
Coursera is having
[[https://www.coursera.org/learn/exploring-basic-income-in-a-changing-economy][a
Teach-Out on Basic Income]].
*** 2018-06-19: pun-generator
:PROPERTIES:
:CUSTOM_ID: pun-generator
:END:
[[https://en.wikipedia.org/wiki/Computational_humor#Pun_generation][Pun
generators exist]].
*** 2018-06-15: hackers-excerpt
:PROPERTIES:
:CUSTOM_ID: hackers-excerpt
:END:
#+begin_quote
But as more nontechnical people bought computers, the things that
impressed hackers were not as essential. While the programs themselves
had to maintain a certain standard of quality, it was quite possible
that the most exacting standards---those applied by a hacker who
wanted to add one more feature, or wouldn't let go of a project until
it was demonstrably faster than anything else around---were probably
counterproductive. What seemed more important was marketing. There
were plenty of brilliant programs which no one knew about. Sometimes
hackers would write programs and put them in the public domain, give
them away as easily as John Harris had lent his early copy of
Jawbreaker to the guys at the Fresno computer store. But rarely would
people ask for public domain programs by name: they wanted the ones
they saw advertised and discussed in magazines, demonstrated in
computer stores. It was not so important to have amazingly clever
algorithms. Users would put up with more commonplace ones.
The Hacker Ethic, of course, held that every program should be as good
as you could make it (or better), infinitely flexible, admired for its
brilliance of concept and execution, and designed to extend the user's
powers. Selling computer programs like toothpaste was heresy. But it
was happening. Consider the prescription for success offered by one of
a panel of high-tech venture capitalists, gathered at a 1982 software
show: "I can summarize what it takes in three words: marketing,
marketing, marketing." When computers are sold like toasters, programs
will be sold like toothpaste. The Hacker Ethic notwithstanding.
#+end_quote
[[http://www.stevenlevy.com/index.php/books/hackers][Hackers: Heroes of
the Computer Revolution]], by Steven Levy.
*** 2018-06-11: catalan-overflow
:PROPERTIES:
:CUSTOM_ID: catalan-overflow
:END:
To compute Catalan numbers without unnecessary overflow, use the
recurrence formula \(C_n = {4 n - 2 \over n + 1} C_{n - 1}\).
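A small Python sketch of that recurrence (an illustration of the
remark, not taken from any source): multiplying first and then dividing
is exact because each \(C_n\) is an integer, and the intermediate values
never grow much beyond the result, unlike factorial-based formulas.
#+begin_example
def catalan(n):
    c = 1                                  # C_0 = 1
    for k in range(1, n + 1):
        c = c * (4 * k - 2) // (k + 1)     # C_k = (4k - 2) / (k + 1) * C_{k-1}
    return c

print([catalan(i) for i in range(8)])      # [1, 1, 2, 5, 14, 42, 132, 429]
#+end_example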
*** 2018-06-04: boyer-moore
:PROPERTIES:
:CUSTOM_ID: boyer-moore
:END:
The
[[https://en.wikipedia.org/wiki/Boyer–Moore_majority_vote_algorithm][Boyer-Moore
algorithm for finding the majority of a sequence of elements]] falls in
the category of "very clever algorithms".
#+begin_example
int majorityElement(vector<int>& xs) {
    // Boyer-Moore voting: keep one candidate and a counter.
    // Assumes a majority element (> n/2 occurrences) exists; otherwise
    // a second pass is needed to verify the returned candidate.
    int count = 0;
    int maj = xs[0];
    for (auto x : xs) {
        if (x == maj) count++;        // same as candidate: reinforce
        else if (count == 0) maj = x; // votes cancelled out: switch candidate
        else count--;                 // different element: cancel one vote
    }
    return maj;
}
#+end_example
*** 2018-05-30: how-to-learn-on-your-own
:PROPERTIES:
:CUSTOM_ID: how-to-learn-on-your-own
:END:
Roger Grosse's post
[[https://metacademy.org/roadmaps/rgrosse/learn_on_your_own][How to
learn on your own (2015)]] is an excellent modern guide on how to learn
and research technical stuff (especially machine learning and maths) on
one's own.
*** 2018-05-25: 2048-mdp
:PROPERTIES:
:CUSTOM_ID: 2048-mdp
:END:
[[http://jdlm.info/articles/2018/03/18/markov-decision-process-2048.html][This
post]] models 2048 as an MDP and solves it using policy iteration and
backward induction.
*** 2018-05-22: ats
:PROPERTIES:
:CUSTOM_ID: ats
:END:
#+begin_quote
ATS (Applied Type System) is a programming language designed to unify
programming with formal specification. ATS has support for combining
theorem proving with practical programming through the use of advanced
type systems. A past version of The Computer Language Benchmarks Game
has demonstrated that the performance of ATS is comparable to that of
the C and C++ programming languages. By using theorem proving and
strict type checking, the compiler can detect and prove that its
implemented functions are not susceptible to bugs such as division by
zero, memory leaks, buffer overflow, and other forms of memory
corruption by verifying pointer arithmetic and reference counting
before the program compiles. Additionally, by using the integrated
theorem-proving system of ATS (ATS/LF), the programmer may make use of
static constructs that are intertwined with the operative code to
prove that a function attains its specification.
#+end_quote
[[https://en.wikipedia.org/wiki/ATS_(programming_language)][Wikipedia
entry on ATS]]
*** 2018-05-20: bostoncalling
:PROPERTIES:
:CUSTOM_ID: bostoncalling
:END:
(5-second fame) I sent a picture of my kitchen sink to BBC and got
mentioned in the [[https://www.bbc.co.uk/programmes/w3cswg8c][latest
Boston Calling episode]] (listen at 25:54).
*** 2018-05-18: colah-blog
:PROPERTIES:
:CUSTOM_ID: colah-blog
:END:
[[https://colah.github.io/][colah's blog]] has a cool feature that
allows you to comment on any paragraph of a blog post. Here's an
[[https://colah.github.io/posts/2015-08-Understanding-LSTMs/][example]].
If it is doable on a static site hosted on Github pages, I suppose it
shouldn't be too hard to implement. This also seems to work more
seamlessly than [[https://fermatslibrary.com/][Fermat's Library]],
because the latter has to embed pdfs in webpages. Now fantasy time:
imagine that one day arXiv shows html versions of papers (through author
uploading or conversion from TeX) with this feature.
*** 2018-05-15: random-forests
:PROPERTIES:
:CUSTOM_ID: random-forests
:END:
[[https://lagunita.stanford.edu/courses/HumanitiesSciences/StatLearning/Winter2016/info][Stanford
Lagunita's statistical learning course]] has some excellent lectures on
random forests. It starts with explanations of decision trees, followed
by bagged trees and random forests, and ends with boosting. From these
lectures it seems that:
1. The term "predictors" in statistical learning = "features" in machine
   learning.
2. The main idea of random forests, namely dropping predictors for
   individual trees and aggregating by majority vote or averaging, is the
   same as the idea of dropout in neural networks, where a proportion of
   neurons in the hidden layers are dropped temporarily during different
   minibatches of training, effectively averaging over an ensemble of
   subnetworks. Both tricks are used as regularisation, i.e. to reduce
   variance. The only difference is that in random forests all but
   roughly a square root of the total number of features are dropped,
   whereas the dropout ratio in neural networks is usually a half (see
   the sketch after this list).
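A sketch of point 2 (assuming scikit-learn is available; the dataset
and numbers are my own illustration, not from the course): a random
forest where each split only considers roughly the square root of the
predictors, the counterpart of dropout's random removal of hidden units.
#+begin_example
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)   # 30 predictors

# max_features="sqrt": each split sees ~sqrt(30) of the 30 predictors,
# analogous to temporarily dropping neurons in dropout.
rf = RandomForestClassifier(n_estimators=200, max_features="sqrt",
                            random_state=0)
print(cross_val_score(rf, X, y, cv=5).mean())
#+end_example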
By the way, the slides of the Statistical Learning course also include a
comparison between statistical learning and machine learning.
*** 2018-05-14: open-review-net
:PROPERTIES:
:CUSTOM_ID: open-review-net
:END:
Open peer review means a peer review process where communications,
e.g. comments and responses, are public.
Like [[https://scipost.org/][SciPost]] mentioned in
[[/posts/2018-04-10-update-open-research.html][my post]],
[[https://openreview.net][OpenReview.net]] is an example of open peer
review in research. It looks like their focus is machine learning. Their
[[https://openreview.net/about][about page]] states their mission, and
here's [[https://openreview.net/group?id=ICLR.cc/2018/Conference][an
example]] where you can click on each entry to see what it is like. We
definitely need this in the maths research community.
*** 2018-05-11: rnn-fsm
:PROPERTIES:
:CUSTOM_ID: rnn-fsm
:END:
Related to [[#neural-turing-machine][a previous micropost]].
[[http://www.cs.toronto.edu/~rgrosse/csc321/lec9.pdf][These slides from
Toronto]] are a nice introduction to RNN (recurrent neural network) from
a computational point of view. It states that an RNN can simulate any
FSM (finite state machine, a.k.a. finite automaton, abbr. FA), with a
toy example computing the parity of a binary string.
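Here is a minimal Python sketch of that parity example (with hand-set
weights rather than learned ones; my own illustration, not from the
slides): two threshold units compute XOR between the hidden state and
the current input bit, so the hidden state carries the running parity,
exactly the one-bit memory the corresponding FSM needs.
#+begin_example
def step(z):
    return 1 if z > 0 else 0

def parity(bits):
    h = 0                              # hidden state: parity so far
    for x in bits:
        a = step(h + x - 1.5)          # fires only when h = x = 1
        h = step(h + x - 2 * a - 0.5)  # h := XOR(h, x)
    return h

print(parity([1, 0, 1, 1]))  # 1, since there is an odd number of ones
#+end_example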
[[http://www.deeplearningbook.org/contents/rnn.html][Goodfellow et
al.'s book]] (see pages 372 and 374) goes one step further, stating that
an RNN with a hidden-to-hidden layer can simulate Turing machines, and not
only that, but also the /universal/ Turing machine, abbr. UTM (the book
referenced
[[https://www.sciencedirect.com/science/article/pii/S0022000085710136][Siegelmann-Sontag]]),
a property not shared by the weaker network where the hidden-to-hidden
layer is replaced by an output-to-hidden layer (page 376).
By the way, the RNN with a hidden-to-hidden layer has the same
architecture as the so-called linear dynamical system mentioned in
[[https://www.coursera.org/learn/neural-networks/lecture/Fpa7y/modeling-sequences-a-brief-overview][Hinton's
video]].
From what I have learned, the universality of RNNs and that of
feedforward networks are therefore due to different arguments, the
former coming
from Turing machines and the latter from an analytical view of
approximation by step functions.
*** 2018-05-10: math-writing-decoupling
:PROPERTIES:
:CUSTOM_ID: math-writing-decoupling
:END:
One way to write readable mathematics is to decouple concepts. One idea
is the following template. First write a toy example with all the
important components present in this example, then analyse each
component individually and elaborate how (perhaps more complex)
variations of the component can extend the toy example and induce more
complex or powerful versions of the toy example. Through such
incremental development, one should be able to arrive at any result in
cutting edge research after a pleasant journey.
It's a bit like the UNIX philosophy, where you have a basic system of
modules like IO, memory management, graphics etc, and modify / improve
each module individually (H/t [[http://nand2tetris.org/][NAND2Tetris]]).
The book [[http://neuralnetworksanddeeplearning.com/][Neural networks
and deep learning]] by Michael Nielsen is an example of such an
approach. It begins the journey with a very simple neural net with one
hidden layer, no regularisation, and sigmoid activations. It then
analyses each component, including cost functions, the backpropagation
algorithm, the activation functions, regularisation and the overall
architecture (from fully connected to CNN), individually, and improves
the toy example incrementally. Over the course of the book, the accuracy
on the MNIST example grows incrementally from 95.42% to 99.67%.
*** 2018-05-09: neural-turing-machine
:PROPERTIES:
:CUSTOM_ID: neural-turing-machine
:END:
#+begin_quote
One way RNNs are currently being used is to connect neural networks
more closely to traditional ways of thinking about algorithms, ways of
thinking based on concepts such as Turing machines and (conventional)
programming languages. [[https://arxiv.org/abs/1410.4615][A 2014
paper]] developed an RNN which could take as input a
character-by-character description of a (very, very simple!) Python
program, and use that description to predict the output. Informally,
the network is learning to "understand" certain Python programs.
[[https://arxiv.org/abs/1410.5401][A second paper, also from 2014]],
used RNNs as a starting point to develop what they called a neural
Turing machine (NTM). This is a universal computer whose entire
structure can be trained using gradient descent. They trained their
NTM to infer algorithms for several simple problems, such as sorting
and copying.
As it stands, these are extremely simple toy models. Learning to
execute the Python program =print(398345+42598)= doesn't make a
network into a full-fledged Python interpreter! It's not clear how
much further it will be possible to push the ideas. Still, the results
are intriguing. Historically, neural networks have done well at
pattern recognition problems where conventional algorithmic approaches
have trouble. Vice versa, conventional algorithmic approaches are good
at solving problems that neural nets aren't so good at. No-one today
implements a web server or a database program using a neural network!
It'd be great to develop unified models that integrate the strengths
of both neural networks and more traditional approaches to algorithms.
RNNs and ideas inspired by RNNs may help us do that.
#+end_quote
Michael Nielsen,
[[http://neuralnetworksanddeeplearning.com/chap6.html#other_approaches_to_deep_neural_nets][Neural
networks and deep learning]]
*** 2018-05-09: neural-nets-activation
:PROPERTIES:
:CUSTOM_ID: neural-nets-activation
:END:
#+begin_quote
What makes the rectified linear activation function better than the
sigmoid or tanh functions? At present, we have a poor understanding of
the answer to this question. Indeed, rectified linear units have only
begun to be widely used in the past few years. The reason for that
recent adoption is empirical: a few people tried rectified linear
units, often on the basis of hunches or heuristic arguments. They got
good results classifying benchmark data sets, and the practice has
spread. In an ideal world we'd have a theory telling us which
activation function to pick for which application. But at present
we're a long way from such a world. I should not be at all surprised
if further major improvements can be obtained by an even better choice
of activation function. And I also expect that in coming decades a
powerful theory of activation functions will be developed. Today, we
still have to rely on poorly understood rules of thumb and experience.
#+end_quote
Michael Nielsen,
[[http://neuralnetworksanddeeplearning.com/chap6.html#convolutional_neural_networks_in_practice][Neural
networks and deep learning]]
*** 2018-05-08: sql-injection-video
:PROPERTIES:
:CUSTOM_ID: sql-injection-video
:END:
Computerphile has some brilliant educational videos on computer science,
like [[https://www.youtube.com/watch?v=ciNHn38EyRc][a demo of SQL
injection]], [[https://www.youtube.com/watch?v=eis11j_iGMs][a toy
example of the lambda calculus]], and
[[https://www.youtube.com/watch?v=9T8A89jgeTI][explaining the Y
combinator]].
*** 2018-05-08: nlp-arxiv
:PROPERTIES:
:CUSTOM_ID: nlp-arxiv
:END:
Primer Science is a tool by a startup called Primer that uses NLP to
summarize contents (but not single papers, yet) on arxiv. A developer of
this tool predicts in
[[https://twimlai.com/twiml-talk-136-taming-arxiv-w-natural-language-processing-with-john-bohannon/#][an
interview]] that progress on AI's ability to extract meaning from AI
research papers will be the biggest accelerant of AI research.
*** 2018-05-08: neural-nets-regularization
:PROPERTIES:
:CUSTOM_ID: neural-nets-regularization
:END:
#+begin_quote
no-one has yet developed an entirely convincing theoretical
explanation for why regularization helps networks generalize. Indeed,
researchers continue to write papers where they try different
approaches to regularization, compare them to see which works better,
and attempt to understand why different approaches work better or
worse. And so you can view regularization as something of a kludge.
While it often helps, we don't have an entirely satisfactory
systematic understanding of what's going on, merely incomplete
heuristics and rules of thumb.
There's a deeper set of issues here, issues which go to the heart of
science. It's the question of how we generalize. Regularization may
give us a computational magic wand that helps our networks generalize
better, but it doesn't give us a principled understanding of how
generalization works, nor of what the best approach is.
#+end_quote
Michael Nielsen,
[[http://neuralnetworksanddeeplearning.com/chap3.html#why_does_regularization_help_reduce_overfitting][Neural
networks and deep learning]]
*** 2018-05-07: learning-knowledge-graph-reddit-journal-club
:PROPERTIES:
:CUSTOM_ID: learning-knowledge-graph-reddit-journal-club
:END:
It is a natural idea to look for ways to learn things like going through
a skill tree in a computer RPG.
For example I made a
[[https://ypei.me/posts/2015-04-02-juggling-skill-tree.html][DAG for
juggling]].
Websites like [[https://knowen.org][Knowen]] and
[[https://metacademy.org][Metacademy]] explore this idea with an added
flavour of open collaboration.
The design of Metacademy looks quite promising. It also has a nice
tagline: "your package manager for knowledge".
There are so so many tools to assist learning / research / knowledge
sharing today, and we should keep experimenting, in the hope that
eventually one of them will scale.
On another note, I often complain about the lack of a place to discuss
math research online, but today I found on Reddit some journal clubs on
machine learning:
[[https://www.reddit.com/r/MachineLearning/comments/8aluhs/d_machine_learning_wayr_what_are_you_reading_week/][1]],
[[https://www.reddit.com/r/MachineLearning/comments/8elmd8/d_anyone_having_trouble_reading_a_particular/][2]].
If only we had this for maths. On the other hand r/math does have some
interesting recurring threads as well:
[[https://www.reddit.com/r/math/wiki/everythingaboutx][Everything about
X]] and
[[https://www.reddit.com/r/math/search?q=what+are+you+working+on?+author:automoderator+&sort=new&restrict_sr=on&t=all][What
Are You Working On?]]. Hopefully these threads can last for years to
come.
*** 2018-05-02: simple-solution-lack-of-math-rendering
:PROPERTIES:
:CUSTOM_ID: simple-solution-lack-of-math-rendering
:END:
The lack of maths rendering in major online communication platforms like
instant messaging, email or Github has been a minor obsession of mine
for quite a while, as I saw it as a big factor preventing people from
talking more maths online. But today I realised this is totally a
non-issue. Just do what people on IRC have been doing since the
inception of the universe: use a (latex) pastebin.
*** 2018-05-01: neural-networks-programming-paradigm
:PROPERTIES:
:CUSTOM_ID: neural-networks-programming-paradigm
:END:
#+begin_quote
Neural networks are one of the most beautiful programming paradigms
ever invented. In the conventional approach to programming, we tell
the computer what to do, breaking big problems up into many small,
precisely defined tasks that the computer can easily perform. By
contrast, in a neural network we don't tell the computer how to solve
our problem. Instead, it learns from observational data, figuring out
its own solution to the problem at hand.
#+end_quote
Michael Nielsen -
[[http://neuralnetworksanddeeplearning.com/about.html][What this book
(Neural Networks and Deep Learning) is about]]
Unrelated to the quote, note that Nielsen's book is licensed under
[[https://creativecommons.org/licenses/by-nc/3.0/deed.en_GB][CC BY-NC]],
so one can build on it and redistribute non-commercially.
*** 2018-04-30: google-search-not-ai
:PROPERTIES:
:CUSTOM_ID: google-search-not-ai
:END:
#+begin_quote
But, users have learned to accommodate to Google not the other way
around. We know what kinds of things we can type into Google and what
we can't and we keep our searches to things that Google is likely to
help with. We know we are looking for texts and not answers to start a
conversation with an entity that knows what we really need to talk
about. People learn from conversation and Google can't have one. It
can pretend to have one using Siri but really those conversations tend
to get tiresome when you are past asking about where to eat.
#+end_quote
Roger Schank -
[[http://www.rogerschank.com/fraudulent-claims-made-by-IBM-about-Watson-and-AI][Fraudulent
claims made by IBM about Watson and AI]]
*** 2018-04-06: hacker-ethics
:PROPERTIES:
:CUSTOM_ID: hacker-ethics
:END:
#+begin_quote
- Access to computers---and anything that might teach you something
about the way the world works---should be unlimited and total.
Always yield to the Hands-On Imperative!
- All information should be free.
- Mistrust Authority---Promote Decentralization.
- Hackers should be judged by their hacking, not bogus criteria such
as degrees, age, race, or position.
- You can create art and beauty on a computer.
- Computers can change your life for the better.
#+end_quote
[[https://en.wikipedia.org/wiki/Hacker_ethic][The Hacker Ethic]],
[[https://en.wikipedia.org/wiki/Hackers:_Heroes_of_the_Computer_Revolution][Hackers:
Heroes of the Computer Revolution]], by Steven Levy
*** 2018-03-23: static-site-generator
:PROPERTIES:
:CUSTOM_ID: static-site-generator
:END:
#+begin_quote
"Static site generators seem like music databases, in that everyone
eventually writes their own crappy one that just barely scratches the
itch they had (and I'm no exception)."
#+end_quote
__david__@hackernews
So did I.