There is a pattern I see repeatedly with stories about North Korea that later turn out to be false. (Like the last time [Kim Jong-un was reported to be dead in 2016](https://www.nknews.org/2016/06/false-alarm-on-kim-jong-uns-death-shakes-south-korea/), or [the time he was reported to have been assassinated in Beijing in 2012](https://nationalpost.com/news/kim-jong-un-assassination-rumours-flood-twitter-weibo), or the time in 2008 "credible" publications like the BBC were giving airtime to a Japanese professor [who said Kim Jong-il died in 2003 and was re...
> A book is said to have been written by a language model if a language model wrote at least 99% of the text contained in the main text in the book, excluding a potential foreword, copyright notice, table of contents, and other non-essential book sections. Stylistic
Two potential cop-out resolutions:
1. A foreign language book was translated into English by a language model (very plausible)
2. Non-linear: sentences remain unchanged, but their orderings are rearranged. Linear: passing the same input to the model multiple times and cherry-picking the be...
Looking at the discussion so far, the consensus seems to be that anyone familiar with chatbots, language models, and the current SOTA for each would easily be able to distinguish them from a human over the course of two hours by trying various strategies and tactics to trip them up. There are pieces of evidence that might indicate that the Judges are likely to be familiar with modern ML techniques:
1. Ray Kurzweil and Mitchell Kapor are appointing them according to their "best judgement" (document is vague about criteria)
2. Kurzweil argues: "The ...
I see regular updates here on KJU's absence from the Rodong Sinmun (currently 17 days). But that made me wonder: how often have absences occurred in the past? Is this especially unusual? Are people reading into it too much?
The RS's "Supreme Leaders' Activities" section with dates can be found [here](http://www.rodong.rep.kp/en/index.php?strPageID=SF01_01_02&iMenuID=1&iSubMenuID=1) (warning: it's slow as hell), so it should be pretty trivial to scrape and form a more accurate base rate – but here's what I saw via cursory examination of the first few pag...
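For anyone who does get round to scraping it, here is a minimal sketch of the base-rate step only, assuming you already have the list of activity dates. The function names and the example dates are placeholders I made up for illustration; they are not real Rodong Sinmun data, and the scraping itself is left out.

```python
from datetime import date


def absence_gaps(appearance_dates):
    """Return the day gaps between consecutive reported appearances."""
    ordered = sorted(set(appearance_dates))
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def share_of_gaps_at_least(gaps, threshold_days):
    """Fraction of historical gaps that lasted at least threshold_days."""
    if not gaps:
        raise ValueError("need at least two dated appearances")
    return sum(gap >= threshold_days for gap in gaps) / len(gaps)


# Placeholder dates for illustration only (NOT scraped data).
example_dates = [
    date(2000, 1, 1),
    date(2000, 1, 4),
    date(2000, 1, 25),
    date(2000, 2, 1),
    date(2000, 2, 3),
]

gaps = absence_gaps(example_dates)
print(gaps)                              # [3, 21, 7, 2]
print(share_of_gaps_at_least(gaps, 17))  # 0.25
```

With the real dates plugged in, the second number is a rough answer to "how often has an absence of 17 days or more happened before?".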
So, between the last time anyone posted here and now, OpenAI released GPT-3. It doesn't do very well on Winograd, which tests common-sense reasoning and is what most people here are hung up on, although they only tested it in few-shot settings. But it did infer the rules of arithmetic even while hampered by byte-pair encoding, where smaller models failed to do better than chance. For a very good discussion of GPT-3's limitations and its possibilities, I would recommend [gwern's post](https://www.gwern.net/GPT-3).
And the thing is, we're still getting large ...
Recent progress on each of these individual tasks for anyone who isn't caught up:
1. Loebner, Turing Test: https://ai.facebook.com/blog/state-of-the-art-open-source-chatbot/ - "In an A/B comparison between human-to-human and human-to-Blender conversations to measure engagement, models fine-tuned with BST tasks were preferred 49 percent of the time to humans"
2. Winograd: https://arxiv.org/abs/1910.10683 - Google's T5-11B is reported to achieve 93.8% WSC accuracy; Microsoft's Turing-NLG, with 17 billion parameters, is probably better.
3. SAT: https://alle...
Knowing that the volume of news about Biden's gaffes is making the community overestimate on this question (at least, according to the analysis below), should we maybe restrict the flood of information sources to what's relevant? I know that the secondary purpose is to boost the question higher but that should really only occur when a substantial update has happened...
Has anyone done a base-rate estimate for upsets in net worth rankings where 4th/5th placers took 1st? Data here goes back to 2000. It might be a good idea to open up questions titled something like "What will Jeff Bezos/Bernard Arnault/Bill Gates's highest/lowest net worth be at any point during the 2020s?" that would plug into this one.
someone on the KJU death thread made a collaborative Bayesian calculator based on the evidence we have so far. seems like we have a (very poor) version of this in the form of a comments section (overvalues recency, frequency, etc.).
but what if we formalised this? i'm imagining a diagram of the pieces of evidence which exist, priors and posteriors, p(a|b) for each, the way different pieces of evidence connect to each other, and so on. it could lead to more accurate predictions and more explicit tracking of what judgements are going into probability esti...
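a minimal sketch of what the core of that could look like, assuming each piece of evidence gets a single likelihood ratio p(evidence | true) / p(evidence | false) and that the pieces are conditionally independent (a strong assumption, and exactly the thing the diagram of connections would have to relax). the evidence names and numbers below are made up for illustration, not estimates anyone has posted.

```python
import math


def update(prior, likelihood_ratios):
    """Posterior probability after applying each likelihood ratio to the prior odds.

    Assumes the pieces of evidence are conditionally independent.
    """
    if not 0 < prior < 1:
        raise ValueError("prior must be strictly between 0 and 1")
    log_odds = math.log(prior / (1 - prior))
    for ratio in likelihood_ratios:
        if ratio <= 0:
            raise ValueError("likelihood ratios must be positive")
        log_odds += math.log(ratio)
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical evidence and likelihood ratios, for illustration only.
evidence = {
    "absence from state media": 3.0,
    "unverified outside report": 1.5,
    "official denial": 0.5,
}

posterior = update(0.05, evidence.values())
print(round(posterior, 4))  # 0.1059
```

keeping the ratios in a named dict is the point: everyone can see which judgement moved the estimate and argue about that one number instead of the final probability.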
* would the federal government's credit rating be likely to be affected in the event of a constitutional crisis?
* a poll asking americans who they believe is the current president, split down the middle?
* typically foreign diplomats issue generic well-wishing statements when a new president is elected - could we take an absence of consensus to indicate a crisis?
* very broad - but market volatility in the month following the election could be an indicator. likewise, we could use a cluster of google trends keywords
i'm trying to think of external b...
Highest photovoltaic efficiency deployed in production by a solar manufacturer by 2030?
Largest carbon sequestration or filtering operation, measured in thousands of tonnes removed annually, by 2040?
ExxonMobil: largest share value decline over a two-year period before 2038?
Next recession catalysed by climate change related event?
Fortune 500: how many carbon neutral or carbon negative by 2025?
4 degrees of warming above pre-industrial levels before 2060? Rise of half a degree or more within a three-year period? Decline?
Major geoengineering project aimed at halting...
@(Uncle Jeff) my original reasoning was the UN membership condition would serve to disambiguate borderline cases. the two cases I was thinking of were
1. microstates with questionable legitimacy (e.g. a group asserts that they have created a state whose territory extends over a small area, but is not recognised by anybody)
2. situations of contested power (e.g. a group has *de facto* but not *de jure* power over a certain region, or vice versa; there are multiple groups claiming to be legitimate authorities; there are multiple groups with *de facto* power over...
In case you didn't see it on the front page of r/ML
https://arxiv.org/abs/2002.05645
> Widely popular transformer-based NLP models such as BERT and Turing-NLG have enormous capacity trending to billions of parameters. Current execution methods demand brute-force resources such as HBM devices and high speed interconnectivity for data parallelism. In this paper, we introduce a new relay-style execution technique called L2L (layer-to-layer) where at any given moment, the device memory is primarily populated only with the executing layer(s)'s footprint. Th...
Glad to see Metaculusers at least are more resilient to spurious claims this time around (and the self-amplifying media spectacle that inevitably surrounds them) than PredictIt, which updated from 90 to 75.
Here's one that might be interesting given the paucity of information coming out of the country: will North Korea transition to a market economy by 2040?
We've seen some speculation that the leadership is opting for a [Vietnamese-style Doi Moi policy](https://www.cnbc.com/2019/02/13/north-korea-may-choose-to-follow-vietnams-economic-model.html), that China is pushing them in that direction, that they've got a booming (unrecognised) internal market already, and that their diplomatic overtures to the U.S. have the end goal of lifting all sanctions.
But h...