[ nominal delivery draft, 21 May 15 ]

Driven by Data
LANGSEC Workshop
Dan Geer

Hello, and thank you for the invitation to speak with you. What you are trying to do is important. And hard. Perhaps those two, important and hard, go together. Certainly cybersecurity in the general sense is hard. We have a pretty deep well of evidence that cybersecurity is hard. We have a pretty good understanding of how important targets have a hard time defending themselves, particularly against attackers who are nation states or one notch off of nation states. At the risk of seeming oddly self-congratulatory, I've long thought that cybersecurity is the most difficult intellectual occupation on the planet, as we have the dual challenges of rapid change and sentient opponents.

Yet at the same time, in the cybersecurity occupation we certainly seem to be getting better and better. We have better tools, we have better understood practices, and we have more and better colleagues. That's the plus side. But from the point of view of prediction, what matters is the ratio of skill to challenge; as far as I can estimate, we are expanding the society-wide attack surface faster than we are expanding our collection of tools, practices, and colleagues. If your society is growing more food, that's great. If your population is expanding faster than your improvements in food production can keep up, that's bad. So it is with cyber risk management: Whether in detection, control, or prevention, we are notching personal bests, but all the while the opposition is setting world records.

As with most decision making under uncertainty, statistics have a role, particularly ratio statistics that magnify trends so that the latency of feedback from policy changes becomes clear more quickly. Yet statistics, of course, require measurement, to which I will return in a moment.

I do not have well-vetted figures on the total societal spend on cybersecurity, but it is clearly larger than ever before, yet so are the failures that we know about. According to a power law analysis of data breaches, the failures that we do not yet know about -- but in time will -- are likely to be worse than the ones we do know about already. A week ago today,[TH] "James Trainor, acting assistant director of the FBI's Cyber Division, said the agency used to learn about a new, large-scale data breach every two or three weeks. 'Now, it is close to every two to three days.' Trainor also said the cybersecurity industry needs to 'double or triple' its workforce in order to keep up with hacking threats."

In short, that a storm is gathering is in some sense understood intuitively by the general public, even though they have become jaded to press reports on cybercrime. As it happens, my wife and I run a small business, which is irrelevant to this Workshop except for one thing: just this week one of our high-school-aged employees told me that his guidance counselor advised him that the two best choices for what to study in college would be game design or cybersecurity, because if you get good at either of those you can get jobs well paid enough that the student loan burden won't be impossible to manage. In other words, the idea that cybersecurity matters and that we are losing has reached the level of high school career advice.

As you no doubt also know, what to do about cybersecurity has also reached the highest policy levels of government. Jockeying for position and authority aside, the question of "What is to be done?"
is on the lips of a broad spectrum of government officials. Part of that is quite naturally that here in the United States we have the most to lose from cybersecurity failures for the simple, but frequently forgotten, reason that we depend on the digital realm more than any other society -- you are only at risk from something you depend on.

Almost a year ago, the Pew Research Center invited 12,000 "experts" to answer a single Yes/No question:

   By 2025 will there be significant changes for the worse and hindrances to the ways in which people get and share content online compared with the way globally networked people can operate online today?[PEW]

Of the 12,000 invited, some 1,400 did answer. Putting aside whatever selection bias may be reflected in who chose to answer and who did not, Pew found four themes dominated respondent comments:

   1) Actions by nation-states to maintain security and political control will lead to more blocking, filtering, segmentation, and balkanization of the Internet.

   2) Trust will evaporate in the wake of revelations about government and corporate surveillance and likely greater surveillance in the future.

   3) Commercial pressures affecting everything from Internet architecture to the flow of information will endanger the open structure of online life.

   4) Efforts to fix the "too much information" problem might over-compensate and actually thwart content sharing.

My colleague Rob Lemos mapped Pew's themes to two alternative futures -- one where cyberspace is made to look more and more like meatspace, and the other where meatspace is made to look more and more like cyberspace. Lemos[RL] wrote that "If cyberspace converges to our physical reality, then we will have balkanization and commercial efforts to artificially create information monopolies, while if the physical world goes toward digital space, then we have greater surveillance, the erosion of trust, much information leakage, and the reaction to that leakage."

More crucially, Lemos also observed that the growth of technology has greatly increased personal power: The impact that a single person can have on society has significantly increased over time to where a single individual can have a devastating effect. The natural reaction for government is to become more invasive {possibility #2 above} to better defend its monoculture, or more separate {possibility #1 above} to firewall threats from one another. Because threats and kinetic impacts can increasingly travel through the digital realm, they necessitate that the policy and legal frameworks of the digital and physical world converge.

In other words, Lemos argues that convergence is an inevitable consequence of the very power of cyberspace in and of itself. I don't argue with Lemos' idea that increasingly powerful, location independent technology in the hands of the many will tend to force changes in the distribution of power. In fact, that is a central theme of this essay -- that the power that is growing in the net, per se, will soon surpass the ability of our existing institutions to modify it in any meaningful way, so either the net must be broken up into governable chunks or the net becomes government. It seems to me that the leverage here favors cyberspace whenever and wherever we give cyberspace a monopoly position, which we are doing both blindly and often.

In the last couple of years, I've found that institutions that I more or less must use -- my 401(k) custodian, the Government Accounting Office's accounts payable department, the payroll service my employer outsources to, etc. -- no longer accept paper letter instructions; they each accept only digital delivery of such instructions. This means that each of them has created a critical dependence on an Internet swarming with men in the middle and, which is more, they have doubtlessly given up their own ability to fall back to what worked for a century before. It is that giving up of alternative means that really defines what convergence is and does.

It is said that all civil wars are about on whose terms re-unification will occur. I would argue that we are in, to coin a phrase, a Cold Civil War to determine on whose terms convergence occurs. Everything in meatspace we give over to cyberspace replaces dependencies that are local and manageable with dependencies that are certainly not local and, I would argue, much less manageable because they are much less secure. I say that exactly because the root cause of risk is dependence, and most especially dependence on expectations of system state. I say "much less secure" because one is secure, that is to say that one is in a state of security, if and only if there can be no unmitigatable surprises. The more we put on the Internet, the broader and the less mitigatable any surprises become.

This line of thought is beginning to sink in. Let me quote from a Bloomberg article also nearly a year old:[CWC]

   Wall Street's biggest trade group has proposed a government-industry cyber war council to stave off terrorist attacks that could trigger financial panic by temporarily wiping out account balances, according to an internal document. The proposal by the Securities Industry and Financial Markets Association calls for a committee of executives and deputy-level representatives from at least eight U.S. agencies including the Treasury Department, the National Security Agency and the Department of Homeland Security, all led by a senior White House official. The document sketches an unusually frank and pessimistic view by the industry of its readiness for attacks wielded by nation-states or terrorist groups that aim to "destroy data and machines." It says the concerns are "compounded by the dependence of financial institutions on the electric grid," which is also vulnerable to physical and cyber attack.

So here you have the biggest financial firms saying that their dependencies are no longer manageable, and that the State's monopoly on the use of force must be brought to bear. What they are talking about is that they have no way to mitigate the risk of common mode failure.

To repeat, risk is a consequence of dependence. Because of shared dependence, aggregate societal dependence on the Internet is not estimable. If dependencies are not estimable, they will be underestimated. If they are underestimated, they will not be made secure over the long run, only over the short. As the risks become increasingly unlikely to appear, the interval between events will grow longer. As the latency between events grows, the assumption that safety has been achieved will also grow, thus fueling increased dependence in what is now a positive feedback loop. Accommodating old methods and Internet rejectionists preserves alternate, less complex, more durable means and therefore bounds dependence. Bounding dependence is *the* core of rational risk management.
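
That feedback loop is easy to caricature in a few lines of code. The sketch below is purely illustrative and every number in it is invented; it is not a model of anything measured, only the shape of the argument: each quiet interval raises perceived safety, rising perceived safety raises dependence, and whatever dependence has accumulated by the time the surprise arrives is the size of the surprise.

    # A toy caricature of the dependence feedback loop; all parameters invented.
    def simulate(years=20, quiet_years=15):
        dependence = 1.0           # arbitrary units of reliance on the network
        perceived_safety = 0.5     # grows with every uneventful year
        for year in range(1, years + 1):
            if year < quiet_years:
                # No visible failure: confidence grows, so dependence grows too.
                perceived_safety = min(1.0, perceived_safety + 0.05)
                dependence *= 1.0 + 0.2 * perceived_safety
                print(f"year {year:2d}: perceived safety {perceived_safety:.2f}, "
                      f"dependence {dependence:.1f}")
            else:
                # The surprise arrives; its impact scales with accumulated dependence.
                print(f"year {year:2d}: failure impact ~{dependence:.1f}x the year-1 exposure")
                break

    if __name__ == "__main__":
        simulate()

The numbers mean nothing; the shape is the point -- the longer the quiet stretch, the larger the eventual surprise, unless dependence is bounded along the way.
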
We show no signs of being willing to bound dependence, yet if we don't bound dependence, then we invite common mode failure. In the language of statistics, common mode failure comes exactly from under-appreciated mutual dependence. Quoting [NIST]:

   [R]edundancy is the provision of functional capabilities that would be unnecessary in a fault-free environment. Redundancy is necessary, but not sufficient for fault tolerance... System failures occur when faults propagate to the outer boundary of the system. The goal of fault tolerance is to intercept the propagation of faults so that failure does not occur, usually by substituting redundant functions for functions affected by a particular fault. Occasionally, a fault may affect enough redundant functions that it is not possible to reliably select a non-faulty result, and the system will sustain a common-mode failure. A common-mode failure results from a single fault (or fault set). Computer systems are vulnerable to common-mode resource failures if they rely on a single source of power, cooling, or I/O. A more insidious source of common-mode failures is a design fault that causes redundant copies of the same software process to fail under identical conditions.

That last part -- that "a more insidious source of common-mode failures is a design fault that causes redundant copies of the same software process to fail under identical conditions" -- is exactly what can be masked by complexity, precisely because complexity ensures under-appreciated mutual dependence.
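
That observation is easy to see in miniature. The sketch below is mine, not NIST's, and the record format in it is invented: three "redundant" replicas of one flawed length-field parser sit behind a majority vote, and the vote is no help at all against the one input that matters, because all three replicas carry the same design fault.

    # A minimal sketch of common-mode failure in software: three "redundant"
    # replicas of one flawed parser vote on a result. Redundancy protects
    # against independent faults; a shared design fault defeats all replicas
    # on exactly the same input.
    import struct
    from collections import Counter

    def parse_record(blob: bytes):
        """Shared, flawed parser: trusts the declared length without validating it."""
        (declared_len,) = struct.unpack(">H", blob[:2])
        payload = blob[2:2 + declared_len]
        checksum = blob[2 + declared_len]      # crashes when the length field lies
        return payload, checksum

    def replicated_parse(blob: bytes):
        results = []
        for _ in range(3):                     # three byte-identical replicas
            try:
                results.append(parse_record(blob))
            except Exception:
                pass                           # this replica has failed
        if not results:
            raise RuntimeError("common-mode failure: every replica failed on the same input")
        return Counter(results).most_common(1)[0][0]   # majority vote

    good = struct.pack(">H", 5) + b"hello" + b"\x2a"
    bad  = struct.pack(">H", 500) + b"hello" + b"\x2a"  # lies about its own length

    print(replicated_parse(good))              # (b'hello', 42)
    try:
        replicated_parse(bad)
    except RuntimeError as e:
        print(e)

Replication, voting, and failover buy protection only against faults that are independent; identical code parsing identical input offers no such independence.
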
And that is where LANGSEC comes in. LANGSEC is one answer, perhaps the answer, to the question of common mode software failure, at least in circumstances where we have some idea what we are depending upon and are thus under a common threat from its disappearance or perversion. I imagine that even in the lay world it is understood that the more we depend on something, the more effort should have been put into making it bulletproof and/or fail-safe.

Take plane crashes. Plane crashes cannot be hidden, so we learn something from each one. A few months after the November 2001 crash of an airliner in Queens, New York, I had occasion to ask an FAA analyst about the NTSB finding that the pilot had broken the tail off the plane. I said that I had never heard of that before. He said, "And you won't; we learn from mistakes." Well said. On the other hand, near misses are more frequent and generally they are not reported. That suggests we don't learn from them.

Just as the National Transportation Safety Board takes mandatory reports of airliner crashes, the CDC takes mandatory reports of public health events, meaning immediate disclosure of patients who have smallpox or anthrax or the plague, not patients who have cancer or third-degree burns. I proposed elsewhere[DG1] that the force of law require reporting of cybersecurity failures above some severity threshold that we have yet to negotiate, while below that threshold we follow Richard Danzig's suggestion in "Surviving on a Diet of Poisoned Fruit," where he made this policy proposal:[RD]

   Fund a data collection consortium that will illuminate the character and magnitude of cyber attacks against the U.S. private sector, using the model of voluntary reporting of near-miss incidents in aviation. Use this enterprise as well to help develop common terminology and metrics about cybersecurity. While regulatory requirements for aviation accident reporting are firmly established through the National Transportation Safety Board, there are no requirements for reporting the vastly more numerous and often no less informative near misses. Efforts to establish such requirements inevitably generate resistance: Airlines would not welcome more regulation and fear the reputational and perhaps legal consequences of data visibility; moreover, near accidents are intrinsically more ambiguous than accidents. An alternative path was forged in 2007 when MITRE, a government contractor, established an Aviation Safety Information Analysis and Sharing (ASIAS) system receiving near-miss data and providing anonymized safety, benchmarking and proposed improvement reports to a small number of initially participating airlines and the Federal Aviation Administration (FAA). Today, 44 airlines participate in that program voluntarily.

The combination of a mandatory CDC model for above-threshold cyber events and a voluntary ASIAS model for below-threshold events is what I recommend. This leaves a great deal of thinking still to be done: diseases are treated by professionals, but malware infections are treated by amateurs; diseases spread within jurisdictions before they become global, but malware is global from the get-go; diseases have predictable behaviors, but malware comes from sentient opponents. Don't think this or any proposal to ensure that we get a chance to learn from failure is an easy one or that it is devoid of side effects. Let me quote Francis Bacon: "Truth emerges more readily from error than from confusion." What I am arguing for is that we not keep ourselves in a state of confusion but that we graduate to a state of acknowledged error. Were we to do so, LANGSEC would clearly be part of the solution to the puzzle of cybersecurity, as not only is LANGSEC a path to rigor, it is also a path away from confusion.

And confusion we've got, on several fronts. I'm working on an article about the rising undiscoverability of what is on the Internet at large. There are lots of examples: the fraction of all reachable hosts that have a DNS entry of any kind is falling; carrier-NAT conceals a great deal of everything; IPv6 trees are too big to walk; and what counts as an end in the end-to-end conceptualization is falling to the reality of VMs and, more importantly, containers. Put differently, we can no longer bound the attack surface in the network sense.

Nor can we bound it in the software sense. It is fairly clear that as applications, especially web applications, grow, the complexity of interactions among their moving parts increases. The HTTP Archive says that the average web page today makes out-references to 16 different domains as well as making 17 Javascript requests per page, and the Javascript byte count is five times the HTML byte count.[HT] A lot of that Javascript is about analytics, which is to say surveillance of the user "experience" (and we're not even talking about getting your visitors to unknowingly mine Bitcoin for you by adding Javascript to your website that does exactly that[BJ]). My colleagues over at Veracode are seeing machine-written code of vast size that contains apparent vulnerabilities -- meaning even machines write vulns.

In a relatively recent Atlantic Monthly article,[BS] Bruce Schneier asked a cogent first-principles question: Are vulnerabilities in software dense or sparse?
If they are sparse, then every one you find and fix meaningfully lowers the number of avenues of attack that are extant. If they are dense, then finding and fixing one more is essentially irrelevant to security and a waste of the resources spent finding it. Six-take-away-one is a 17% improvement. Six-thousand-take-away-one has no detectable value. Which is it?

In one of my recent For Good Measure (security metrics) columns, I took a stab at applying the techniques of field biology to the question of whether vulnerabilities are sparse or dense.[DG2] That is one path to an answer of whether finding and fixing bugs is worth the effort. Sandy Clark, amongst others, takes a different tack, arguing that exploitable flaws still take our opponents a while to find and to weaponize, but if rollover of the code base can be accelerated, then you may be able to get the rollover time to be shorter than the opponents' reverse-engineering time,[SC] in which case you win. Auto-update with the barest regression testing for operability might then supplant taking care for security, except that that would mean the more important the code, the more often you would have to roll it over. Try selling that idea to your Compliance Officer or the upstream bureaucrats.

One might argue that LANGSEC is specific to errors that are sparse but unobvious. That is a really big deal in a context where every day that goes by without some colossal upheaval causes amateur Bayesians to say, "See, the risk is small, and that we had another day today when nothing bad happened means that the probability of something really bad is monotonically decreasing." That's not true for Poisson processes and that is not true for cyber warfare operatives doing the equivalent of mining the harbor.

Let me put it differently. Our collective attack surface is increasing quickly in volume, in complexity, and in interlocking dependencies. To maintain constant risk in the context of an expanding attack surface requires commensurate decreases in points of exploit within that nominal attack surface. If we are going to double the number of applications or hosts or endpoints, then we must halve the attackability of those applications, hosts, or endpoints. That is not happening.

Workshop organizer Meredith Patterson gave me a quotation from Taylor Hornby that I hadn't seen. In it, Hornby succinctly states the kind of confusion we are in and which LANGSEC is all about:

   The illusion that your program is manipulating its data is powerful. But it is an illusion: The data is controlling your program.

As it happens, Hornby was one of this year's judges for the Underhanded Crypto Contest. The winner, John Meacham, writes a short enough description of the hidden exploit in his code for me to quote it almost in full:

   This library has the very specific target of undermining the security of the "internet of things". It was designed to be attractive to being integrated into IoT hardware platforms where it will be difficult to patch once it is discovered. The backdoor also has a fairly strong degree of deniability. It arises from a rare, yet plausible bug. It would be very hard to establish malice as opposed to simple oversight on the programmer's part once the exploit is publicly discovered. The library implements the AES block cipher in CTR mode and for the most part behaves as it is described in almost all circumstances it would normally be used in. Notably it passes all the FIPS and RFC test vectors for AES encoding and can interoperate with openssl.
   The library can be used in most any application where AES is needed and will behave appropriately. The library becomes exploitable specifically when it gets used as part of an implementation of IPSec over IPv6 in a resource-constrained IP stack. In such a case, under normal operation the library will perform properly, interoperating with other IPSec aes-ctr hosts. But when it is triggered by a forged ICMPv6 packet it will lead to a full plaintext reveal. Another forged ICMP packet will restore the conforming behavior so that normal packet retries will mask the improper behavior. The bug is triggered when an encoded stream is split on a boundary that is not divisible by 16 and then subsequently restarted. When this is done, the key schedule will be reseeded by the current IV. In practice, this will not come up due to buffers almost always being a power of two greater than 16 when data is split up or when an odd size is needed, it is due to the end of an encryption stream and not one that will be restarted. The compromised packets will be dropped by the receiving host as corrupted in flight due to the MAC not agreeing and the packets will be resent with the new 16-byte-aligned fragments properly encoded. By flitting back and forth between MTUs, an eavesdropper can obtain almost the entire plaintext of an encrypted transaction.

This is precisely the thing that LANGSEC is all about: a weird machine triggered by specific input in a specific setting, with the extra feature that a different specific input can put the system back into normal operation, attractive for integration into devices too small or distant to be patchable after discovery, and plausibly deniable. Is that prizeworthy or what?
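
To make the failure class concrete without reproducing Meacham's code, here is a deliberately tiny sketch of my own. It is not AES and not his backdoor; the "keystream" is just a hash in counter mode, which is enough to show why reusing, or predictably reseeding, a CTR-style keystream hands the plaintext to an eavesdropper who already knows any part of it.

    # Toy illustration only: a stand-in "CTR keystream" built from SHA-256,
    # not real AES and not Meacham's library. The point is the arithmetic of
    # keystream reuse, which is what a reseed-on-restart bug gives an attacker.
    import hashlib
    from itertools import count

    def keystream(key: bytes, iv: bytes, nbytes: int) -> bytes:
        """Counter-mode keystream: hash(key || iv || counter), concatenated."""
        out = b""
        for ctr in count():
            if len(out) >= nbytes:
                break
            out += hashlib.sha256(key + iv + ctr.to_bytes(8, "big")).digest()
        return out[:nbytes]

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    key, iv = b"k" * 16, b"i" * 16

    # The flaw: when a stream is split and "restarted", the implementation
    # reseeds from the IV instead of continuing the counter, so the second
    # fragment is encrypted under the very same keystream as the first.
    p1 = b"known fragment of the data stream.."   # something the attacker knows
    p2 = b"secret fragment nobody should read."   # same length, to keep the toy simple
    c1 = xor(p1, keystream(key, iv, len(p1)))
    c2 = xor(p2, keystream(key, iv, len(p2)))     # reused keystream: the bug

    # Two-time pad: c1 XOR c2 == p1 XOR p2, so knowing p1 reveals p2 -- no key needed.
    print(xor(xor(c1, c2), p1))                   # b'secret fragment nobody should read.'

Nothing about the toy is cryptographically subtle; the subtlety in Meacham's entry is that the trigger and the reset both arrive as ordinary-looking input and the damage is masked by normal retries, which is the Hornby point in miniature: the data is driving the program.
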
Workshop organizer Sergey Bratus wrote me saying almost the same thing:

   Even systems for which we have constructed correctness proofs in the C.A.R. Hoare sense, [those proofs] are... for _specific preconditions_, and the proofs tend to say nothing of _what_ and _how much of_ unwanted computation these proven pieces of code would be capable of when the preconditions do not fully hold, for reason of programmer error or some other reason. There appears to be a whole universe of unwanted computation dual to and larger than that of the intended computation, and this "dark computation" universe is expanding.

Along with my still somewhat back-of-the-envelope calculations of the scale of dark reachability in networks, the dark reachability in code space may be growing faster still. It would surprise me if the huge investment in offensive cyber technology were not a proxy measure of the growth in some combination of darkly reachable networks and darkly reachable code space. That offense budgets for this are growing as fast as they appear to be has to signal some degree of outcomes favorable to those who stockpile cyber weapons. As many of you know, research programs do not grow when they are not getting results. Just to pound that home, Chris Inglis, recently retired NSA Deputy Director, said to me that if we were to score cyber the way we score soccer, the tally would be 462-456 twenty minutes into the game, i.e., all offense.[CI] I will take his comment as confirming at the highest level that offense is where the innovations that only States can afford are going on, and that they have results which I'd wager are all but entirely weird machines.

It almost appears that we are building weird machines on purpose, almost the weirder the better. Take big data and deep learning. Where data science spreads, a massive increase in tailorability to conditions follows. But even if Moore's Law remains forever valid, there will never be enough computing, hence data-driven algorithms must favor efficiency above all else; yet the more efficient the algorithm, the less interrogatable it is,[MO] which is to say that the more optimized the algorithm is, the harder it is to know what the algorithm is really doing.[SFI]

And there is a feedback loop here: The more desirable some particular automation is judged to be, the more data it is given. The more data it is given, the more its data utilization efficiency matters. The more its data utilization efficiency matters, the more its algorithms will evolve to opaque operation. Above some threshold of dependence on such an algorithm in practice, there can be no going back. As such, if science wishes to be useful, preserving algorithm interrogatability despite efficiency-seeking, self-driven evolution is the research-grade problem now on the table. If science does not pick this up, then Lessig's characterization of code as law[LL] is fulfilled. But if code is law, what is a weird machine?

Note that these algorithms do still specify a language of "good and valid" inputs, but because this specification is implicit and noninterrogatable, there is no way to actually verify the specification, either for required properties or for correctness of implementation. This underscores what I understand to be a couple of LANGSEC slogans: "Ambiguity is insecurity" and "Explicit is better than implicit."

Suppose for a moment, though, that we knew how to do compliance testing for some set of LANGSEC principles. Would we find parser and parser-differential bugs sparse or dense? Would that be true in web applications more than in firewall code? What would we discover? Do we even know whether fuzzing is effective or how effective it is? If I fuzz the daylights out of some input-accepting code and simply nothing happens, can I make the claim that the absence of evidence is the evidence of absence? Almost surely not, which is to say that execution models, including execution models for exploits, do not fall out of fuzzing. For our purposes, does fuzzing even matter? I would not count on it to estimate the sparseness or density of exploitable flaws. So does a complexity measure, like McCabe's cyclomatic complexity perhaps, tell us anything predictive about the weird machines a given code body might harbor -- an upper or lower bound on their density, say? Or is it not code complexity that matters, and we should instead look at protocol complexity? Does any protocol that has complex enough data -- such as ASN.1 and perhaps especially BER -- necessarily become a fount of bugs once approached deeply enough? What did we learn from OpenSSL, from Heartbleed?

John Quarterman[JQ] estimates that as much as 10% of Internet backbone traffic is unidentifiable as to protocol. Even if he is wrong about 10%, the phenomenon is still real. What share of that traffic is peer-to-peer? What share of that traffic is coming from a test harness for finding weird machines? Is its very unidentifiability a signature for parser targeting?

Some of you may know one of the core ideas about the early Internet, something known as the Robustness Principle or, more often, as Postel's Law, which reads "Be conservative in what you send, [but] be liberal in what you accept." That is not exactly an engraved invitation to weird machine deployment, but you can read it that way.
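
A hedged sketch of the difference, using an invented heartbeat-like record format -- not OpenSSL's code and not any real protocol: the liberal handler believes the sender's declared length and will echo back bytes it was never sent, while the conservative handler treats any disagreement between the declared and the actual length as a reason to reject.

    # Toy length-prefixed "echo" record: [2-byte declared length][payload].
    # The liberal handler embodies "be liberal in what you accept"; the strict
    # handler refuses input whose declared length disagrees with reality.
    import struct

    MEMORY = b"AAAA:secret-session-key:BBBB"      # stand-in for adjacent process memory

    def handle_liberal(record: bytes) -> bytes:
        (declared,) = struct.unpack(">H", record[:2])
        payload = record[2:]
        # Trusts the declared length: copies that many bytes from wherever the
        # payload happens to sit -- a Heartbleed-style over-read in miniature.
        buf = payload + MEMORY
        return buf[:declared]

    def handle_strict(record: bytes) -> bytes:
        (declared,) = struct.unpack(">H", record[:2])
        payload = record[2:]
        if declared != len(payload):
            raise ValueError("declared length disagrees with actual payload: reject")
        return payload

    honest = struct.pack(">H", 4) + b"ping"
    greedy = struct.pack(">H", 24) + b"ping"      # asks for far more than it sent

    print(handle_liberal(honest))                 # b'ping'
    print(handle_liberal(greedy))                 # b'ping' plus 20 bytes never sent
    print(handle_strict(honest))                  # b'ping'
    try:
        handle_strict(greedy)
    except ValueError as e:
        print("strict handler:", e)

The strict handler accepts a small, fully specified input language; the liberal one quietly accepts a much larger language than anyone intended, and that excess is exactly where weird machines live.
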
In 2011, Eric Allman wrote a reconsideration of the Robustness Principle[EA] in which he argued that because interoperability and security are in a natural tension, "be conservative in what you accept" has likely become the necessary rule as the spread of the Internet has, in so many words, made every sociopath your next-door neighbor. Coming from the original author of Sendmail, Allman's conclusion has the weight of considerable experience. But, as the cliche goes, you ain't seen nothing yet. New protocols and new stacks are appearing every day, or so it seems.

One of the venture firms I deal with says that they are tracking something over 1,000 security startups. I've not read a thousand business plans, but I have read a lot of them, and there is no surer sign that we are in need of new disciplines. A great number of those business plans are tightly bound to some very specific problem. A great number of them are, as I said earlier, part and parcel of the deep learning sort where, in the end, your choice is either to trust the algorithms or not -- there is no interrogating them. Many of the business plans start from the hopeful position that anomaly detection of one sort or another is the essential key to cybersecurity, but each of those anomaly detectors has a training problem in the form of "if we pick the wrong week to call normal, then the attacker who is already inside is now in our scope of normal." Would you go so far as to say that deriving security algorithms from data optimistically called "normal operation" results in a weird machine if there is an attacker already present when you take your snapshot of normality? Perhaps and perhaps not, but trojaned hardware might be easier to describe this way, and without argument.

A correspondent recently laid down a crisp point of view on the economics here that goes like this:[AR]

   Shrink-wrap and enterprise license sales are slowly going away. Everything is moving to the cloud. ... All the major cloud providers package open source technologies in such a way that they can reimplement large chunks of it with closed source versions. The "Hadoop" you get in the cloud is not the Hadoop you download from Apache even though it may start that way. The cloud makes it easy to look like open source without actually being open source. I've mentioned it before here but a lot of open source software is slowly being killed by the cloud because cloud providers generate a lot of margin on TCO arbitrage. Most open source infrastructure software is badly architected and poorly implemented because so much of it started with the design of one person trying to solve a problem they were not experts at solving, which has punishing TCO implications at scale. So companies reimplement their own vastly more efficient versions of open source with a large disincentive toward open sourcing those reimplementations. If open source wants to remain relevant in the cloud, large swaths of it need to seriously level up their engineering game. The open source ecosystems that are being liberally adopted "as-is" in major cloud infrastructures are the handful where the quality is high enough that improvements from reimplementation would be marginal, e.g., PostgreSQL or LLVM.

You may or may not agree with that commenter's position, but he is right that what you think you are running on your hardware and what is running under the same name on someone else's hardware need only be compatible at the protocol level; the rest is opaque.

I'm not here to kick open source in the behind (far from it), but the critique is real and relevant. This means that the central question for LANGSEC is not one question but, as I count it, three:

   1. What to teach?
   2. How to teach?
   3. How to test if the teaching took hold?

The "What to teach?" part demands that you, for some value of you, come to some conclusions on what a LANGSEC process or mindset or paradigm looks like. As far as I know, you are reasonably close to an answer there, but you do have the difficulty that since LANGSEC is about a level of perfection not previously obtainable, it is a bit hard to forthrightly agree on the point where good enough is good enough. You need to teach developers how they can, themselves, know whether they are or are not contributing to the world's inventory of weird machines. How do they know that they are "doing it right" or "doing the best they can"? Do we need to approach LANGSEC in a Red Team / Black Team way?

For "How to teach?" the question has to do with the student body, so to speak. How do you reach the architects and the programmers of new protocols and the respective stacks? The need to get the class into the classroom is overwhelmingly urgent; just consider vehicle-to-vehicle (V2V) communications on the roadways where, soon enough, protocol negotiation will have to be present the way it has to be present in, say, TLS renegotiation -- and that flaw was in a security protocol! Tightly coupling the digital world with the physical world is no joke. If self-driving cars talking to each other to make joint decisions is not pressing enough, then the North American power grid, generally known as "the biggest and most complex machine in the world," will soon have similar issues. Five years ago, Kelly Ziegler calculated that patching a fully deployed Smart Grid would take an entire year to complete, largely because of the size of the per-node firmware relative to the available powerline bandwidth.[KZ] That means that an ounce of prevention is worth rather more than a pound of cure.

As to "How to test if the teaching took hold?", the LANGSEC community needs to attend to that soon. Like it or not, compliance controls most security spend, and compliance regimes change but slowly. Something that can be operationalized by lesser lights than those here today has to appear if the LANGSEC ideas are to usefully spread. You need to articulate some formalized set of development practices around "LANGSEC compliance."

While this is all going on, I have a question to ask: Is cybersecurity permanently offense-dominant? DARPA's Cyber Grand Challenge[CGC] may answer that question at this summer's DefCon, when a completely robotic capture-the-flag (CTF) contest is held. My money is on the offense and on offense dominance as a force of nature. If I'm right, then the selling point for LANGSEC is a relative one -- those who adopt what you teach make somebody else's code the code that is easier to attack. That is probably the best you can do in the medium term, but it is not nothing.

There are said to be five phases of technical maturity:

   1. You had an idea.
   2. You could actually make it work.
   3. You could convince a gullible friend to try it.
   4. People stopped asking you why you were doing what you do.
   5. Other people get asked why they aren't.

In my estimation, LANGSEC is at step 4; you are not getting much in the way of "Why on earth are you working on that?" You have a way to go before you get to step 5, where other people get asked why they can't be bothered to try.
For that, you need some successes, particularly if anyone reading about them says to themselves, "I could do that."

There's never enough time. Thank you for yours.

--------------------------------------------
references (in alphabetic order):

[AR] xent.com/pipermail/fork/Week-of-Mon-20150427/065374.html
[BJ] Bitcoin Miner for Websites; www.bitcoinplus.com/miner/embeddable
[BS] "Should U.S. Hackers Fix Cybersecurity Holes or Exploit Them?" www.theatlantic.com/technology/archive/2014/05/should-hackers-fix-cybersecurity-holes-or-exploit-them/371197
[CGC] Cyber Grand Challenge, DARPA, www.darpa.mil/cybergrandchallenge/
[CI] Chris Inglis, confirmed by personal communication, 2014
[CWC] "Banks Dreading Computer Hacks Call for Cyber War Council," www.bloomberg.com/news/print/2014-07-08/banks-dreading-computer-hacks-call-for-cyber-war-council.html
[DG1] "Cybersecurity as Realpolitik," Black Hat, August 2014, geer.tinho.net/geer.blackhat.6viii14.txt
[DG2] "The Undiscovered," ;login, April 2015, http://geer.tinho.net/fgm/fgm.geer.1504.pdf
[EA] Eric Allman, "The Robustness Principle Reconsidered," queue.acm.org/detail.cfm?id=1999945
[HT] Trends, HTTP Archive, www.httparchive.org/trends.php
[JQ] John Quarterman, personal communication, 2013
[KZ] Kelly Ziegler, "The Future of Keeping the Lights On," USENIX, 2010; static.usenix.org/events/sec10/tech/slides/ziegler.pdf
[LL] Lawrence Lessig, _Code: And Other Laws of Cyberspace_, Basic Books, 2000
[MO] Michael Osborne, Cambridge University, personal communication, 2015
[NIST] High Integrity Software System Assurance, section 4.2, hissa.nist.gov/chissa/SEI_Framework/framework_16.html, but you'll have to look in the Internet Archive for it
[PEW] www.pewinternet.org/2014/07/03/net-threat
[RD] "Surviving on a Diet of Poisoned Fruit: Reducing the National Security Risks of America's Cyber Dependencies," 2014, www.cnas.org/surviving-diet-poisoned-fruit
[RL] Rob Lemos, personal communication, 2014
[SC] Sandy Clark, et al., "The Honeymoon Effect and the Role of Legacy Code in Zero-Day Vulnerabilities," ACSAC, 2010, www.acsac.org/2010/openconf/modules/request.php?module=oc_program&action=view.php&a=&id=69&type=2
[SFI] "Optimality vs. Fragility: Are Optimality and Efficiency the Enemies of Robustness and Resilience?" www.santafe.edu/gevent/detail/business-network/1665
[TH] thehill.com/policy/cybersecurity/242110-fbi-official-data-breaches-increasing-substantially

--------------------------------------------

This and other material on file at http://geer.tinho.net/pubs