Internet Research (with tangent on intelligence analysis and collapse)

Arkanj3l 31 Jul 2013 4:58 UTC

16 points

Want to save time? Skip down to “I’m looking to compile a thread on Internet Research”!

Opinionated Preamble:

There is a lot of high-level thinking on Less Wrong, which is great. It’s done wonders to structure and optimize my own decisions. But I think the political and futurology-related issues that Less Wrong covers can sometimes get out of sync with the reality and injustices of events in the immediate world. There are comprehensive treatments of how medical science is failing, or how academia cannot give unbiased results, and this is the milieu of programmers and philosophers in the middle-to-upper class of the planet. I at least believe that this circle of awareness can be expanded, even if it’s treading into mind-killing territory. If anything, I want to give people a near-mode sense of the stakes aside from x-risk: all in all, the x-risk scenarios I’ve seen Less Wrong fear the most kill humanity more or less instantly. A slower descent into violence and poverty is to me much more horrifying, because I might have to live in it and I don’t know how. As a matter of fact, I have no idea how to predict it.

This is one reason why I’m drawn to the Intelligence Operations performed by the military and crime units, among other things. Intelligence product delivery is about raw and immediate *fact*, and there is a lot of it. The problems featured in IntelOps are among the few things rationality is good for: highly uncertain scenarios with one-off executions and messy or noisy feedback. Facts get lost in translation as messages are passed along, and of course feeding and receiving fake facts is all part of the job. Nevertheless, knowing *everything* *everywhere* is in the job description, and some form of rationality became a necessity.

It gets ugly. The demand for these kinds of skills often lies in industries that are highly competitive, violent, and illegal. I believe that once you take a close look at how force and power are applied in practice, there isn’t any pretending anymore that human evils are an accident.

Open Source Intelligence, or “OSINT”, is the mining of data and facts from public information databases, news articles, codebases, and journals. Although the amount of classified data dwarfs the unclassified, the size and scope of the unclassified makes it responsible for a majority of intelligence reports, and thus it is involved in the great majority of executive decisions made by government entities. It’s worth giving some thought as to how much that we know, that they do too. As illustrated in this exposé, the processing of OSINT is a big chunk of what modern intelligence is about, among many other things. I think understanding how rationality as developed on Less Wrong can contribute to better IntelOps, and how IntelOps can feed the rationality community, would be awesome, but that’s a post for another time.

The Show

Through my investigations into IntelOps I’ve noticed the emphasis on search. Good search.

I’m looking to compile a thread on Internet Research. I’m wondering if there is any wisdom on Less Wrong that we can take advantage of here on how to become more effective searchers. Here are some questions that could be answered specifically, but they are just guidelines. Feel free to voice associated thoughts; we’re exploring here.

Before actually going out and searching, what would be the most effective way of drafting and optimizing a collection plan? Are there any formal optimization models that inform our distribution of time and attention? Exploration vs exploitation comes to mind, but it would be worth formulating something specific. I heard that the multi-armed bandit problem is solved?

Do you have any links or resources regarding more effective search?

Do you have any experiences regarding internet research that you can share? Any patterns that you’ve noticed that have made you more effective at searching?

What are examples of closed-source information that are low-hanging fruit in terms of access (e.g. academic journals)? What are possible strategies for acquiring closed-source data (e.g. enrolling in small courses at universities, e-mailing researchers, coercion via the law/Freedom of Information Act, social engineering, etc.)?

I would like to hear from SEOs and software developers about their interpretation of semantic web technologies and how these are going to affect end-users. I am somewhat unfamiliar with the semantic web, but from my understanding, information that could not be indexed before is now indexed, and new ontologies will emerge as this information is mined. What should an end-user expect, and what opportunities will there be that didn’t exist in the current generation of search?
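To make the exploration vs exploitation question above a bit more concrete, here is a minimal sketch of an epsilon-greedy allocation of search attempts across sources. It is a simple heuristic, not an optimal solution to the bandit problem, and everything specific in it is an assumption made up for illustration: the source names, the hit rates, and the `try_source` callback that reports whether a query paid off.

```python
import random


def epsilon_greedy(sources, try_source, rounds, epsilon=0.1, rng=None):
    """Split `rounds` search attempts across `sources` (a list of names).

    `try_source(name)` is a hypothetical callback that returns a payoff,
    e.g. 1.0 if the query turned up something useful and 0.0 otherwise.
    """
    rng = rng or random.Random()
    pulls = {s: 0 for s in sources}     # attempts made per source
    mean = {s: 0.0 for s in sources}    # running average payoff per source
    for _ in range(rounds):
        untried = [s for s in sources if pulls[s] == 0]
        if untried:
            choice = untried[0]                    # try every source once
        elif rng.random() < epsilon:
            choice = rng.choice(sources)           # explore: random source
        else:
            choice = max(sources, key=mean.get)    # exploit: best so far
        reward = try_source(choice)
        pulls[choice] += 1
        # Incremental update of the running mean.
        mean[choice] += (reward - mean[choice]) / pulls[choice]
    return pulls, mean


if __name__ == "__main__":
    rng = random.Random(0)
    # Made-up probabilities that one query to each source is useful.
    hit_rate = {"web search": 0.3, "library database": 0.6, "mailing list": 0.1}

    def try_source(name):
        return 1.0 if rng.random() < hit_rate[name] else 0.0

    pulls, mean = epsilon_greedy(list(hit_rate), try_source, rounds=500, rng=rng)
    for name in hit_rate:
        print(f"{name}: {pulls[name]} attempts, average payoff {mean[name]:.2f}")
```

The `epsilon` parameter is the fraction of attempts spent on a random source rather than the current best one; raising it buys more exploration at the cost of time spent on sources already known to be poor.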

That should be enough to get started. Below are some links that I have found useful with respect to Internet Research.

Meta-Search Engines or Assisted Search:

Carrot—http://search.carrot2.org/stable/search (concept clustering search engine)

Summarizers:

TextTeaser—http://www.textteaser.com/ - SOURCE: https://github.com/MojoJolo/textteaser
Copernic (Commercial Summarizing Feed Program) - http://www.copernic.com/en/products/summarizer/

Bots/Collectors/Automatic Filters:

Google Alerts—http://www.google.ca/alerts
Change Detection—http://www.changedetection.com/

Compilations and Directories:

Directories and Search Engine Repository—http://rr.reuser.biz/index.html (probably the last one you’ll ever need.)

How to Perform Industry Research—http://businesslibrary.uflib.ufl.edu/industryresearch

Guides:

Google Guide—http://www.googleguide.com/ (with practice and tutorials)

From UC Berkeley—http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/FindInfo.html

“How to Solve Impossible Problems”—http://www.johntedesco.net/blog/2012/06/21/how-to-solve-impossible-problems-daniel-russells-awesome-google-search-techniques/

The NSA Guide to “Untangling the Web”; Internet Research—http://www.nsa.gov/public_info/_files/Untangling_the_Web.pdf [C. 2007]

Fravia’s Learnings on searching (value in essays) - http://search.lores.eu/indexo.htm [C. 1990s − 2009]

“Power Searching With Google” Course—http://www.powersearchingwithgoogle.com/

Practice:

SearchReSearch—http://searchresearch1.blogspot.ca/

A Google A Day—http://agoogleaday.com/

I don’t really care how you use this information, but I hope I’ve jogged some thinking about why it could be important.


[deleted]
- Richard_Kennaway 5 Aug 2013 11:58 UTC
  5 points
  0
  Parent
  So, you don’t want this community, and judging by the karma scores, this community doesn’t want you.
  
  There is an obvious win-win solution.
- ChristianKl 5 Aug 2013 15:29 UTC
  1 point
  0
  Parent
  
  If discussions about how best to accomplish this are “mind killing” then the smartest people in our civilization (no, not the posters on this blog) are all brain-dead.
  
  Neither Yudkowsky nor I argue that discussions about how best to accomplish this are always “mind killing”. It’s something you think because you are angry, and that anger prevents you from clearly understanding what other people are thinking.
  
  That anger mind killed you. It makes you ineffective at convincing other people.
  
  People in the real world don’t refer to intelligent political positions as “mind-killing.”
  
  Being angry is no intelligent political position.
  
  If you live in China you can achieve some political ends by getting a sufficient number of people angry. You can’t in Western democracies. Julian Assange made the point that most of the actual power is in contracts, which just won’t change because someone is angry. I’m not 100% with Assange on the point, but he said that the reason free speech is legal in Western democracies while it isn’t in China is that political power in Western democracies is stable enough that you don’t change power structures with free speech.
  
  According to him, the effort that a given government expends on suppressing a certain type of speech correlates with that speech’s potential to create substantial political change.
  
  Lesswrong is an echo chamber for people whose priorities are computer programming above all else. Such people will spend their lives programming mostly for other people, because their vision is too narrow to own their own lives.
  
  Actually no. There are plenty of people here who care about saving lives in Africa through bed nets that have been proven to be effective. They are not angry that people die of malaria. They just calculate how they can save as many lives as possible and then engage in that course of action.
  
  Then MIRI wants to prevent our world from getting destroyed by an unfriendly artificial intelligence, and many people here think that’s a more important project than being angry about some political injustice.
  
  I myself have even done mainstream media interviews in Germany about QS, where one of the points I make is that people shouldn’t rely on authorities but trust their own judgment. I’m no apolitical person. However, I don’t let emotions like anger cloud my intellectual ability to understand the world in all its shades of grey.
  
  When it comes to a topic like the war on drugs I know the background of politics that doubled the amount of marijuana that you can carry around in Berlin without getting charged with a crime. The people who acted there politically weren’t angry.
  
  In the US there are many places where medical marijuana polls much better than drug legalisation. If you actually want to win politically, it can make a lot of sense to focus on something like medical marijuana, for which it’s easier to find a societal consensus, rather than focusing on anger that everything isn’t legalised.
  
  Political successes need coalition building and that usually doesn’t happen from a place of extreme anger.
Tenoke 31 Jul 2013 10:19 UTC
4 points
0

What are examples of closed-source information that are low-hanging fruit in terms of access (e.g. academic journals)?

This post by me might be relevant.
VincentYu 31 Jul 2013 7:59 UTC
4 points
0
What are examples of closed-source information that are low-hanging fruit in terms of access (e.g. academic journals)?

On the topic of academic journals: I’m graduating from college next year and I want to maintain access to journals without going to grad school immediately. If I were to pay for journal access myself, it would cost me about $20,000 a year to sustain my current reading habits. I’d like to cut that down to below $10,000 (strictly legally).

I’ve only come up with two options so far:
1. Convince a university library or department to sponsor me and give me remote access to the university network.
2. Enroll at a college that has good journal subscriptions and cheap tuition (and provides VPN or EZproxy access to students who never arrive on campus...). Do any of the colleges offering online degrees give network access?
I hope option 1 works out. Are there other options for cheap, legal journal access?
- ChristianKl 31 Jul 2013 10:58 UTC
  9 points
  0
  Parent
  As far as I know at my own university the official alumni organisation provides alumni with the ability to VPN/proxy over the university.
  
  http://www.deepdyve.com/ is also worth a look if you don’t have access to a university. A free account allows you to view journal articles for five minutes. There’s also a $40/month professional plan that gives you longer access to 40 articles per month.
  
  You could also pay a student to be able to use his VPN. I don’t know the legalities of it. It might be illegal. There might also be different laws in different countries.
  
  http://www.reddit.com/r/scholar is a source where you can ask for specific journal articles. But I don’t know the legality of the endeavour.
  - VincentYu 31 Jul 2013 17:00 UTC
    5 points
    0
    Parent
    Great suggestions!
    
    As far as I know at my own university the official alumni organisation provides alumni with the ability to VPN/proxy over the university.
    
    That’s a pretty good investment if I can enroll at a university that offers VPN for alumni. My university doesn’t let alumni access the network, and I think from a quick search that most US universities don’t because of license restrictions. I’ll check out universities in other countries.
    
    http://www.deepdyve.com/ is also worth a look if you don’t have access to a university. A free account allows you to view journal articles for five minutes. There’s also a $40/month professional plan that gives you longer access to 40 articles per month.
    
    Nice! This will be useful right now, so thanks for mentioning it. Unfortunately, their journal selection is limited compared to a university library, and paper downloads are only 20% off the publisher price (and limited to 40 papers per month). I think I’ll try contacting them for custom bulk download plans.
    
    You could also pay a student to be able to use his VPN. I don’t know the legalities of it. It might be illegal. There might also be different laws in different countries.
    
    Account sharing is not allowed at my university, and I think most schools in the US don’t allow it.
    
    http://www.reddit.com/r/scholar is a source where you can ask for specific journal articles. But I don’t know the legality of the endeavour.
    
    There’s also the Less Wrong help desk. Both are useful, but it takes time for a person to process requests, and neither is suitable for high-frequency requests.
    - maia 31 Jul 2013 17:38 UTC
      1 point
      0
      Parent
      /r/scholar is actually surprisingly fast on turnaround time. But it is questionably legal.
- asr 2 Aug 2013 3:41 UTC
  3 points
  0
  Parent
  
  I hope option 1 works out. Are there other options for cheap, legal journal access?
  
  Some big-city public libraries (New York, Boston etc) have journal subscriptions.
- Lumifer 31 Jul 2013 16:40 UTC
  3 points
  0
  Parent
  
  Are there other options for cheap, legal journal access?
  
  I found that Google works well. It’s rare that I find an article I want to read and can’t find somewhere—maybe on the author’s web page, maybe copied to some public directory, maybe someplace else. If everything else fails and you really need it, the authors are usually happy to email you a copy upon request.
  - gwern 31 Jul 2013 19:54 UTC
    5 points
    0
    Parent
    
    I found that Google works well. It’s rare that I find an article I want to read and can’t find somewhere—maybe on the author’s web page, maybe copied to some public directory, maybe someplace else.
    
    For you, perhaps. But for me… Well, I host 580 PDFs on gwern.net because they are not otherwise available publicly, and I link to >865 external PDFs (37 of which are on Internet Archive or Dropbox, and would not be indexed in Google). So for easily a third of the articles I use somewhere, I cannot simply find them online.
  - VincentYu 31 Jul 2013 17:13 UTC
    3 points
    0
    Parent
    I agree, papers are often publicly available somewhere indexed by Google, but I think that happens for less than half the papers I access.
    
    If everything else fails and you really need it, the authors are usually happy to email you a copy upon request.
    
    That’s a good point! However, authors are sometimes slow to respond, and most authors die (or, less drastically, some lose copies of and access to old papers).
[deleted]
- ChristianKl 4 Aug 2013 15:53 UTC
  3 points
  0
  Parent
  
  If my anger is legitimate, then I would be “optimally angry” with your prioritization scheme.
  
  No. When you are angry you are operating in near mode. That’s no state in which you can make effective decisions to achieve political ends.
  
  Is it the position of Eliezer Yudkowsky that talking about politics in a nation where we are still free to speak, and to publicly assemble, is a mindkiller?
  
  How about actually reading the post about Yudkowsky mind killing? I don’t think you fully understand the position which he articulated in it. Probably because you got mind killed.
oooo 2 Aug 2013 2:31 UTC
3 points
0
Sorry, this is a small nitpick. The main searchlores author is Fravia, not Favia. He was instrumental in providing a community and rallying point for various reversing groups. He was anonymous for quite some time, until he passed away in 2009.
Lumifer 31 Jul 2013 16:37 UTC
3 points
0

It’s worth giving some thought as to how much that we know, that they do too.

Why, yes, I believe a fellow by the name of Edward Snowden is interested in that subject, too :-/

Other than that, I find the subject of the thread to be far too wide. Searching is different from collecting (you can run your own net spiders without too many problems). Searching for information about people is different from searching for scientific information, which is different from searching for “that thing about which I have a vague memory that it mentioned X and Y, maybe...”.
- Arkanj3l 31 Jul 2013 17:39 UTC
  2 points
  0
  Parent
  I was mainly pointing out that the reliance on information that is accessible by most anybody is a benefit that levels the playing field, so to speak.
[deleted]
- ChristianKl 4 Aug 2013 15:43 UTC
  2 points
  0
  Parent
  How about separating your posts into shorter paragraphs and focusing more on making a specific point with each paragraph?
[deleted]
- ChristianKl 7 Aug 2013 13:12 UTC
  1 point
  0
  Parent
  
  There is no politics without a fight, and that conflict is still well worth having. If intelligent minds don’t matter to the fight, then the fight would, by definition, not be worth having.
  
  Intelligent minds don’t fight battles because they see an injustice but because they think that they can have an effect by fighting a particular battle.
  
  On this topic, Rop Gonggrijp’s speech at the 27th Chaos Communication Congress comes to mind. Rop was involved in Wikileaks, and it was just a few months after the banking blockade against Wikileaks started.
  
  People ask me “Anonymous… That is the hackers striking back, right?” And then I have to explain that unlike Anonymous, people in this community would probably not issue press release with our real names in the PDF metadata. And that if this community were to get involved, the targets would probably be offline more often.
  
  This is a mental maturity issue: our community has generally succeeded in giving black belts in computer security karate only to people that have proven a certain level of mental maturity. Yes, some of us could probably do some real damage to Paypal and Mastercard. But then we also understand that no good comes from that. In the unlikely event that someone here has not yet reached this level of maturity, please do not connect your machine to the network and talk to some of the other people here for additional perspective.
  
  Children get angry when someone takes away their toys. In modern complicated political conflicts, getting angry doesn’t help. It’s much better to have mental maturity and think things through.
  
  Understanding when blue/green thinking prevents you from accurately thinking about an issue and mind-kills you is important if you want to achieve political goals.
  
  I saw Julian Assange live twice in Berlin at the Chaos Communication Congress. I never participated in Wikileaks myself, but I have seen a bunch of people who did at the Congress and know how they think politically. They aren’t the kind of people who are angry; rather, they think that you are naive for getting angry. They think of the Anonymous crowd who does DDoS out of anger as immature.
  
  At the last LessWrong meetup I attended, I proposed to another attendee a specific way to be politically active and put a significant amount of energy into it.
  
  Yeah, I’m verbose, but newborn synthetic intelligences here shouldn’t have any trouble scanning what I’m saying.
  
  At the same time it shows that you don’t care enough about the ideas that you are advocating to write them in a way that’s more likely to be persuasive. You care more about signaling that you are angry than about doing something that wins political battles.
  
  It’s to engage the general populace rationally, using the time-tested means that have previously produced good results (informed democracy; agitation).
  
  You don’t engage people rationally by being angry. CFAR, which came out of this community, is trying to teach the general populace rational thinking. They don’t try to teach them political answers, but they try to teach them to think for themselves.
  
  For all your pretense of wanting to push individuality, you think that you should focus on teaching others your answers to political questions instead of teaching them to think for themselves. CFAR goes another way.
  
  HPMoR is also quite political when it tries to drive home lessons such as the importance of taking responsibility to save other people instead of just following a role. The way Harry tries to teach evil Draco rationality to turn him good is a proposal for political action.
[deleted]
- Richard_Kennaway 7 Aug 2013 12:33 UTC
  1 point
  0
  Parent
  
  unbounded lifespan
  
  That just caught my eye in the Wall-O-Text(TM). Quoting the context:
  
  the central bank has these laws called “legal tender laws” that mean that you cannot pay taxes in gold and silver, and that such money is not legally used for banking. The primary importance of this isn’t that you can’t obtain gold or silver: you can. The primary importance of this is that you cannot access a wealth, savings, and capital-generating free market that has the ability to produce an unbounded lifespan for you at an affordable cost. The end result is that you die, when under other conditions, you would live. The official obfuscation of the issue is enough to confuse most people, and get them to choose death over resistance to civil government, and its corresponding possibility of life.
  
  Around here, “unbounded lifespan” means literally literally not dying. Is this what you meant, and if so, what is this real or hypothetical “wealth, savings, and capital-generating free market that has the ability to produce an unbounded lifespan for you”?
Arkanj3l 2 Aug 2013 6:44 UTC
1 point
0
Added GoogleGuide—http://www.googleguide.com/ (with practice and tutorials)
Arkanj3l 7 Aug 2013 3:39 UTC
0 points
0
Adding Carrot—a search engine which takes your query and creates dynamic clusters of websites that form around related concepts. It’s like a form of Google’s related searches that does the sorting for you. There are also visualizations that it can generate for you that allow proportionality comparisons.

This is an example query for ‘rationality’ and this one is Explore vs Exploit with a visualization on the side.
[deleted]
- Arkanj3l 4 Aug 2013 8:16 UTC
  0 points
  0
  Parent
  I’ll say that I’m interested in what you have to offer just from the standpoint of novelty and exploration. However, your style doesn’t lend itself to brevity and even though thinking out loud is valuable, getting seven pages out has made me lose track of the point.
  
  I’m glad to see that on certain issues we are in intellectual agreement, but your writing style combined with the sheer amount of academic context you are bringing to the field makes any specific understanding very difficult. Although I am cursorily familiar with maybe a fourth of the authors you mentioned, I feel like in order to do justice to all of them I would need to read primary texts and get to know the literature better. This is something I currently don’t have the time or patience for.
  
  If there are any particular comments that you want to make to me, please do so in my private message box. I am open to picking your brain further and hearing what you have to say. Otherwise I would say that libertarian reform through quantum computers and courthouse bugs is outside the scope of this particular thread.
  
  Anyway, much appreciated. Namaste.
[deleted]
- Document 7 Aug 2013 16:46 UTC
  −2 points
  0
  Parent
  
  When you’re 90 years old and frail is not the right time to decide you should have picked up a machine gun and refused to accept one more minor encroachment on your medical freedom.
  
  I was going to give a sympathy/contrarianism upvote, but you lost me when you got to advocating real-life violence. (Also, are you implying that “the FDA and AMA cartels” are suppressing treatments that would allow a 90-year-old to be in perfect health? Are you talking about cryonics?)
  
  Then again, you mention elsewhere that you wanted downvotes because you were afraid of “artilects” killing everyone with high karma. (Isn’t that the same kind of “cowardice” as doing what you’re told for fear of punishment?) So congratulations, I guess.
[deleted] 13 Jun 2014 11:39 UTC
−3 points
0
Remember not to overextend analytical techniques!

“ACH is not appropriate for all types of decisions. It is used to analyze hypotheses about what is true or what is likely to happen. One might also want to evaluate alternative courses of action, such as alternative business strategies, which computer to buy, or where to retire. In such cases, this software is of limited value. The ACH matrix can be used to break such a decision problem down into its component parts, with alternative choices (comparable to hypotheses) across the top of the matrix and goals or values to be maximized by making the right choice (comparable to evidence) down the side. However, this type of analysis requires a different type of calculation. The principle of refuting hypotheses (in this case alternative courses of action) cannot be applied to a decision based on goals or personal preferences. One would need a more traditional analysis of the pros and cons of each alternative.”

Original source: ACH manual by Richards Heuer

“(ACH) prods you to look for additional evidence you had not realized was relevant, helps you question assumptions, identifies the most lucrative future areas of investigation, and generally stimulates systematic and creative thinking about the issue at hand.”

- Richards Heuer, ACH Creator
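
For concreteness, the kind of matrix the quote describes (hypotheses across the top, evidence down the side) can be sketched in a few lines of Python. This is only an illustration of the refutation idea, not the ACH software itself: the hypotheses, the evidence items, the C/I/N rating labels, and the plain count of inconsistent ratings are all assumptions made up for the example.

```python
# Hypothetical hypotheses (columns of the matrix), purely for illustration.
HYPOTHESES = [
    "H1: scheduled maintenance",
    "H2: taken offline by the host",
    "H3: local network fault",
]

# Hypothetical evidence (rows). Each row holds one rating per hypothesis:
# "C" = consistent, "I" = inconsistent, "N" = neutral / not applicable.
MATRIX = {
    "Other sites load fine": ["C", "C", "I"],
    "Status page announces planned work": ["C", "I", "N"],
    "Unreachable from a second network": ["C", "C", "I"],
}


def inconsistency_counts(hypotheses, matrix):
    """Count, per hypothesis, how many evidence items argue against it."""
    counts = {h: 0 for h in hypotheses}
    for evidence, ratings in matrix.items():
        if len(ratings) != len(hypotheses):
            raise ValueError(f"wrong number of ratings for {evidence!r}")
        for hypothesis, rating in zip(hypotheses, ratings):
            if rating == "I":
                counts[hypothesis] += 1
    return counts


def rank_hypotheses(hypotheses, matrix):
    """Order hypotheses from least to most contradicted by the evidence."""
    counts = inconsistency_counts(hypotheses, matrix)
    return sorted(counts.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for hypothesis, against in rank_hypotheses(HYPOTHESES, MATRIX):
        print(f"{against} inconsistent item(s): {hypothesis}")
```

The ranking works by counting what contradicts each hypothesis rather than what supports it, which is why, as the quote warns, the same calculation does not carry over to choices driven by goals or personal preferences.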