Friday, April 25, 2008

Patch obfuscation etc.

So it seems the APEG paper is getting a lot of attention these days, and some of the conclusions that are (IMO falsely) drawn from it are:
  • patch time to exploit is approaching zero
  • patches should be obfuscated
Before I go into details, a short summary of the paper:
  1. BinDiff-style algorithms are used to find changes between the patched and unpatched version
  2. The vulnerable locations are identified.
  3. Constraint formulas are generated from the code via three different methods:
    1. Static: A graph of all basic blocks on code paths between the vulnerability and the data input into the application is generated, and a constraint formula is generated from this graph.
    2. Dynamic: An execution trace is taken; if the vulnerability occurs on a program path that one can already execute, constraints are generated from this path.
    3. Dynamic/Static: Instead of going from data input to target vulnerability (as in the static approach), one can use an existing path that comes "close" to the vulnerability as starting point from which to proceed with the static approach.
  4. The (very powerful) solver STP is used for solving these constraint systems, generating inputs that exercise a particular code path that triggers the vulnerability.
  5. A number of vulnerabilities are discussed which were successfully triggered using the methods described in the paper
  6. The conclusion is drawn that within minutes of receiving a patch, attackers can use automatically generated exploits to compromise systems.
In essence, the paper implements automated input crafting. The desire to do this has been described before -- Sherri Sparks' talk on "Sidewinder" (using genetic algorithms to generate inputs to exercise a particular path) comes to mind, and many discussions about generating a SAT problem from a particular program path to be fed into a SAT solver (or any other solver for that matter).
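To make the "constraint formula + solver = input" idea a bit more concrete, here is a toy sketch of the principle. It uses Z3's Python bindings purely for illustration (the paper uses STP, and real formulas are extracted from the binary rather than written by hand); the checks below are invented for the example:

from z3 import BitVec, Solver, sat

# Toy path condition: pretend the patch added "if (len > 64) return;" in front
# of a copy. We ask the solver for a length that passes the pre-existing sanity
# checks but violates the newly added bound, i.e. takes the vulnerable path.
length = BitVec("length", 32)

s = Solver()
s.add((length & 0x3) == 0)   # pre-existing check: 4-byte aligned
s.add(length < 0x1000)       # pre-existing upper sanity bound
s.add(length > 64)           # negation of the check the patch introduced

if s.check() == sat:
    print("trigger candidate: length =", s.model()[length])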

What the APEG paper describes is impressive -- using STP is definitely a step forward, as it appears that STP is a much superior solver to pretty much everything else that's publicly available.

It is equally important to keep the limitations of this approach in mind - people are reacting in a panicked manner without necessarily understanding what this can and cannot do.
  1. Possible NP-hardness of the problem. Solving for a particular path is essentially an instance of SAT, and we know that this can be NP-hard. It doesn't have to be, but the paper indicates that there are many formulas STP cannot solve in reasonable time. While this doesn't imply that these formulas are in fact hard to solve, it shows how much this depends on the quality of your solver and the complexity of the formulas that are generated.
  2. The method described in the paper does not generate exploits. It triggers vulnerabilities. Anyone who has worked on even a moderately complex issue in the past knows that there is often a long and painful path between triggering an overflow and making use of it. The paper implies that the results of APEG are immediately available to compromise systems. This is, plainly, not correct. If APEG is successful, the results can be used to cause a crash of a process, and I refuse to call this a "compromise". Shooting a foreign politician is not equal to having your intelligence agency compromise him.
  3. Semantic issues. All vulnerabilities for which this method worked were extremely simple. The actual interesting IGMP overflow Alex Wheeler had discovered, for example, would not be easily dealt with by these methods -- because program state has to be modified for that exploit in a non-trivial way. In essence, a patch can tell you that "this value YY must not exceed XX", but if YY is not direct user data but indirectly calculated through other program events, it is not (yet) possible to automatically set YY.
So in short one could say that APEG will succeed in triggering a vulnerability if the following conditions are met:
  1. The program path between the vulnerability and code that one already knows how to execute is comparatively simple
  2. The generated equation systems are not too complex for the solver
  3. The bug is "linear" in the sense that no complicated manipulation of program state is required to trigger the vulnerability
This is still very impressive stuff, but it reads a lot less dramatic than "one can generate an exploit automatically from an arbitrary patch". All in all, great work, and I do not cease to be amazed by the results that STP has brought to code analysis in general. It confirms that better solvers ==> better code analysis.

What the paper gets wrong IMO are the conclusions about what should be done in the patching process. It argues that because "exploits can be generated automatically, the patching process needs fixing". This is a flawed argument, as ... uhm ... useful exploits can't (yet) be generated automatically. Triggering a vulnerability is not the same as exploiting it, especially under modern operating systems (due to ASLR/DEP/Pax/GrSec).

The paper proposes a number of ways of fixing the problems with the current patching process:

1. Patch obfuscation. The proposal that, zombie-like, comes back every few years: Let's obfuscate security patches, and all will be good. The problems with this are multifold, and quite scary:
    1. Obfuscated executables make debugging for MS ... uhm ... horrible, unless they can undo it themselves
    2. Obfuscated patches remove an essential liberty for the user: The liberty to have a look at a patch and make sure that the patch isn't in fact a malicious backdoor.
    3. We don't have good obfuscation methods that do not carry a horrible performance impact.
    4. Obfuscation methods have the property that they need to be modified whenever attackers break them automatically. The trouble is: Nobody would know if the attackers have broken them. It is thus safe to assume that after a while, the obfuscation would be broken, but nobody would be aware of it.
    5. Summary: Obfuscation would probably a) impact the user by making his code slower, b) impact the user by preventing him from verifying that a patch is not malicious, and c) create support nightmares for MS because they will have to debug obfuscated code. At the same time, it will not provide long-term security.
2. Patch encryption: Distributing encrypted patches, and then finally distributing the encryption key so all systems update at once. This proposal seems to assume that bandwidth is the limiting factor in patch installation, which, as far as I can tell, it is not. This proposal does less damage than obfuscation though -- instead of creating certain disaster with questionable benefit, this proposal just "does nothing" with questionable benefit.

3. Faster patch distribution. A laudable goal, nothing wrong with this.

Anyhow, long post, short summary: The APEG paper is really good, but it uses confusing terminology (exploit ~= vulnerability trigger) which leads to its impact on patch distribution being significantly overstated. It's good work, but the sky isn't falling, and we are far away from generating reliable exploits automatically from arbitrary patches. APEG does generate usable vulnerability triggers for vulnerabilities of a certain form. And STP-style solvers are important.
I have not been blogging nor following the news much in recent months, as I am frantically trying to get all my university work sorted. While I have been unsuccessful at getting everything sorted on the schedule I had set for myself, I am making progress, and expect to be more visibly active again in fall.

Today, I found out that my blog entry on the BlueHat blog drew more feedback than I had thought. I am consistently surprised that people read the things that I write.

Reading my blog post again, I find it so terse I feel I have to apologize for it and explain how it ended up this way. It was the last day of BlueHat, and I was very tired. Those that know me well know that my sense of humor is difficult at the best of times. I have a great talent for sounding bitter and sarcastic when in fact I am trying to be funny and friendly (this has led to many unfortunate situations in my life :-). So I sat down and tried to write a funny blog post. I was quite happy with it when it was done.

In an attack of unexpected sanity, I decided that someone else should read over the post, so I asked Nitin, a very smart (and outrageously polite) MS engineer. He read it, and told me (in his usual very polite manner) ... that the post sucked. I have to be eternally thankful to him, because truly, it did. Thanks Nitin !

So I deleted it, and decided to write down just the core points of the first post. I removed all ill-conceived attempts at humor, which made the post almost readable. It also limited the room for potential misunderstandings.

I would like to clarify a few things that seem to have been misunderstood though:

I did not say "hackers have to" move to greener pastures. I said "hackers will move to greener pastures for a while". This is a very important distinction. In order to clarify this, I will have to draw a bit of a larger arc:

Attackers are, at their heart, opportunists. Attacks go by the old basketball saying about jumpshot technique: "Whoever scores is right". There is no "wrong" way of compromising a system. Success counts, and very little else.

When attackers pick targets, they consider the following dimensions:
  • Strategic position of the target. I will not go into this (albeit important) point too deeply. Let's just assume that, since we're discussing Vista (a desktop OS), the attacker has made up his mind and wishes to compromise a client machine.
  • Impact by market share: The more people you can hack, the better. A widely-installed piece of software beats a non-widely installed piece of software in most cases. There are many ways of estimating this (personal estimates, Gartner reports, internet-wide scans etc.).
  • Wiggle Room: How many ways are there for the attacker to interact with the software ? How much functionality does the software have that operates on potentially attacker-supplied data ? If there are many ways to interact with the application, the odds of being able to turn a bug into a usable attack are greatly increased, and the odds of being able to reach vulnerable code locations are greatly increased. Perhaps the more widely used term is "attack surface", but that term fails to convey the importance of "wiggle room" for exploit reliability. Any interaction with the program is useful.
  • Estimated quality of code: Finding useful bugs is actually quite time consuming. With some experience, a few glances at the code will give an experienced attacker some sort of "gut feeling" about the overall quality of the code.
From these four points, it is clear why IE and MSRPC got hammered so badly in the past: They pretty much had optimal scores on Impact -- they were everywhere. They provided plenty of "Wiggle Room": IE with client-side scripting (yay!), MSRPC through the sheer number of different RPC calls available. The code quality was favourable to the attacker up until WinXP SP2, too.

MS has put more money into SDL than most other software vendors. This holds true both in absolute and in relative terms. MS is in a very strong position economically, so they can afford things other vendors (who, contrastingly, are exposed to market forces) cannot.

The code quality has improved markedly, decreasing the score on the 4th dimension. Likewise, there has been some reduction in attack surface, decreasing the score on the 3rd dimension. This is enough to convince attackers that their time is better spent on 'weaker' targets. The old chestnut about "you don't have to outrun the bear, you just have to outrun your co-hikers" holds true in security more than anywhere else.

In the end, it is much more attractive to attack Flash (maximum score on all dimensions) or any other browser plugins that are widely used.

I stand by my quote that "Vista is arguably the most secure closed-source OS available on the market".

This doesn't mean it's flawless. It just means it's more secure than previous versions of Windows, and more secure than OS X.

There was a second part to my blog post, where I mentioned that attackers are waiting for MS to become complacent again. I have read that many people inside Microsoft cannot imagine becoming complacent on security again. While I think this is true on the engineering level, it is imaginable that security might be scaled down by management.

The sluggish adoption of Vista by end-users is a clear sign that security does not necessarily sell. People buy features, and they cannot judge the relative security of the system. It is thus imaginable that people concerned with the bottom line decide to emphasize features over security again -- in the end, MS is a business, and the business benefits of investing in making code more secure have yet to materialize.

We'll see how this all plays out :-)

Anyhow, the next BlueHat is coming up. I won't attend this time, but I am certain that it will be an interesting event.

Wednesday, April 02, 2008

My valued coworker, SP, has just released his "pet project", Hexer. Hexer is a platform-independent, Java-based, extensible hex editor and can be downloaded from http://www.zynamics.com/files/Hexer-1_0_0.rar

It's also a good idea to visit his blog where he'll write more about its features and capabilities.

Tuesday, April 01, 2008

Oh, before I forget: Ero and I will be presenting our work on structural malware classification at RSA next week. If anyone wishes to schedule a meeting/demo of any of our things (VxClass/BinDiff/BinNavi), please do not hesitate to contact info@zynamics.com.


Some small eye candy: The screenshot shows BinNavi with our intermediate representation (REIL) made visible. While REIL is still very beta-ish, it should be a standard (and accessible) part of BinNavi at some point later this year.

Having a good IR which properly models side effects is a really useful thing: The guys over at the BitBlaze project in Berkeley have shown some really useful things that can be done with a good IR and a good constraint solver :-). I am positively impressed by several papers they have put out.
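To give a rough idea of why explicit side effects matter: a single x86 "add eax, ebx" silently updates the carry and zero flags, while a REIL-style lowering turns them into ordinary data flow. The following is a hand-written illustration (Python tuples, not actual REIL syntax or BinNavi output):

# Illustrative only -- not real REIL output. Each tuple is
# (mnemonic, operand1, operand2, result); t0/t1 are temporaries.
lowered_add = [
    ("add",  "eax/32", "ebx/32",     "t0/64"),  # full-width result, no implicit truncation
    ("and",  "t0/64",  "0xffffffff", "eax/32"), # write back the low 32 bits
    ("bsh",  "t0/64",  "-32",        "t1/32"),  # shift the overflow bit down ...
    ("and",  "t1/32",  "0x1",        "CF/1"),   # ... and keep it as the carry flag
    ("bisz", "eax/32", "",           "ZF/1"),   # zero flag = (result == 0)
]

Once flags are explicit operands like everything else, path conditions and data-flow analyses do not need per-instruction special cases any more.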

I also can't wait to have more of this sort of stuff in BinNavi :-).
Conspiracy theory of the day:

Like everyone else, I am following the US primaries, and occasionally discussing with my brother the implications of the developments for the wider world. My brother is usually good for some quite counter-intuitive insights into things, and described to me a "conspiracy theory" that I find amusing/interesting enough to post here.

Please be aware that the following is non-partisan: I do not really have an idea on whether I'd prefer Mrs Clinton, Mr Obama or Mr McCain in the White House, and this is not a post that is intended to weigh in on either side.

I was a bit puzzled as to why Mrs Clinton is still in the primary race even though her mathematical odds of winning the Democratic nomination seem slim. The conspiracy theory explaining this is the following:

The true goal for Mrs Clinton is now 2012, not 2008. If Mr Obama wins the nomination _and_ the presidency, Mrs Clinton will very likely not become president in her lifetime. On the other hand: If she manages to damage Mr Obama badly enough so that Mr McCain enters the White House, she has good cards to win the Democratic nomination in 2012, and Mr McCain is unlikely to serve a second term (given his age).

It's an interesting hypothesis. Anyhow, I should really get to sleep.

Tuesday, March 11, 2008

A short real-life story on why cryptography breaks:

One of the machines that I am using is a vhost hosted at a German hosting provider called "1und1". Clearly, I am accessing this machine using ssh. So a few weeks ago, to my surprise, my ssh warned me about the host key having changed.

Honored by the thought that someone might take the effort to mount a man-in-the-middle attack for this particular box, my rational brain told me that I should call the tech support of the hosting provider first and ask if any event might've led to a change in keys.

After a rather lengthy interaction with the tech support (who first tried to brush me off by telling me to "just accept the new key"), I finally got them to tell me that they upgraded the OS and that the key had changed. After about 20 minutes of discussion, I finally got them to read the new key to me over the phone, and all was good.
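(For reference, reading out a fingerprint is not exactly rocket science on their side -- it is one line of "ssh-keygen -l -f /etc/ssh/ssh_host_rsa_key.pub" on the server, with the exact path depending on which key type is in use.)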

Then, today, the warning cropped up again. I called tech support, a bit annoyed by these frequent changes. My experience was less than stellar - the advice I received was:
  1. "Just accept the new key"
  2. "The key is likely going to change all the time due to frequent relocations of the vhost so you should always accept it"
  3. "No, there is no way that they can notify me over the phone or in a signed email when the key changes"
  4. "It is highly unlikely that any change that would notify you would be implemented"
  5. "If I am concerned about security, I should really buy an SSL certificate from them" (wtf ??)
  6. "No, it is not possible to read me the key fingerprint over the phone"
The situation got better by the minute. After I told them that last time the helpful support had at least read me the fingerprint over the phone, the support person asked how I could be sure that my telephone call hadn't been man-in-the-middled...

I started becoming slightly agitated at this point. I will speak with them again tomorrow, perhaps I'll be lucky enough to get to 3rd-level support instead of 2nd level. Hrm. As if "customer service" were a computer game, with increasingly difficult levels.

So. Summary: 1und1 seems to think crypto is useless and we should all use telnet. Excellent :-/

Friday, March 07, 2008


Hey all,

we released BinNavi v1.5 last week. Normally, I'd write a lot of stuff here about the new features and all, but this will have to wait for a few days -- I am very tied up with some other work.

With the v1.5 release, we have added disassembly exporters that export from both OllyDbg and ImmunityDbg to our database format -- this means that Navi can now use disassemblies generated from those two debuggers, too. The screenshot above is BinNavi running on Ubuntu with a disassembly exported from the Windows VM that we are debugging into.

Anyhow, the real reason for this post is something completely different: We don't advertise this much on our website, but our tools are available in a sort of 'academic program':

If you are currently enrolled as a full-time student at a university and have an interesting problem you'd like to use our tools for, you can get a license of our tools (Diff/Navi) for a very moderate amount of money. All you have to do is:
  • Contact us (info@zynamics.com) with your name/address/university etc.
  • Explain what project you'd like to work on with our tools
  • Sign an agreement that you will write a paper about your work (after it's done) that we can put on our website
Oh, and you of course have to do the work then and write the paper :-)
Anyhow, I have to get back to work. Expect more posts from me later this year -- things are very busy for me at the moment.

Cheers,
Halvar

Tuesday, February 12, 2008

Hey all,

We will be releasing BinNavi v1.5 next week -- and I can happily say that we will have
many cool improvements that I will blog about next week, once it is out.

Pictures often speak louder than words, so I'll post some of them here:

http://www.zynamics.com/files/navi15.1.png
http://www.zynamics.com/files/navi15.2.png
http://www.zynamics.com/files/navi15.3.png
http://www.zynamics.com/files/tree_lookup.jpg

A more detailed list of new features will be posted next week.

VxClass is making progress as well -- but more on this next week.

If there's anyone interested in our products (BinDiff, BinNavi, VxClass)
in the DC area, I should be free to meet & do a presentation on the products
next week.

Cheers,
Halvar

Tuesday, January 08, 2008

Happy new year everyone.

In June 2006 Dave Aitel wrote on Dailydave that "wormable bugs" are getting rarer. I think he is right, but this month's patch tuesday brings us a particularly cute bug.

I have created a small shockwave film and uploaded it to
http://www.zynamics.com/files/ms08001.swf

Enjoy ! :-)

In other news: We'll be posting screenshots of BinNavi v1.5 (due out in February) and the current VxClass version in the next two weeks - they are coming along nicely.

Cheers,
Halvar

Sunday, October 07, 2007

Our trainings class in Frankfurt is over, and I think I can safely say that it was a resounding success. I guess the coolest thing about SABRE is our customers. I hope to see you all again someplace.

PS: I forgot to distribute the python code from the last day, it will be mailed to all participants on monday.

Monday, September 24, 2007

Blackhat Japan

After the immigration SNAFU in summer, I am scheduled to give my trainings class at Blackhat Japan this November - so if anyone wants to come, sign up now :-)

Cheers,
Halvar

Tuesday, September 04, 2007

BinDiff v2.0 finally released !

This is "blog-spam":

After a long wait, SABRE Security GmbH is proud to announce
the official release of BinDiff v2.0. The biggest improvements are:
  • Higher comparison speeds
  • Greater accuracy for functions which change only in the structure of the graph, not in the number of nodes/edges
  • Much greater accuracy on the instruction level comparison
  • Arguably the prettiest UI of all binary comparison tools around
The many detail improvements are too numerous to mention here.
Check the screenshots:

Contact info@sabre-security.com for an evaluation version !

-- SABRE Security Team

Saturday, August 04, 2007

I am quite famous for botching every marketing effort that we try to undertake at SABRE -- a prime example of my ineptitude is the fact that we released BinNavi v1.2 in ... uh ... January, with a ton of new stuff, and I still hadn't updated the website to show some nice pictures.

Similarly for BinDiff -- v2.0 beta has been used by many customers without a hitch, and is a big improvement on the UI front. So I finally got around to adding some nice pictures today.

Also, for those that are into the entire idea of malware classification, you can see some screenshots of VxClass, our unpacker-and-classifier. (Disclosure, before Spender writes a comment ;-) about our unpacker's inability to handle Themida and similar emulating packers, I will do so myself: We do not handle emulating packers at the moment! We do not reconstruct PEs! But if you have a cool unpacker, you can just upload the unpacked file to our classifier :)

So with this blog post it's confirmed: I am not only a failure at marketing, I am also a failure at attempting to pass off marketing as a regular blog post. Have a good weekend everyone !

Thursday, August 02, 2007

I have reached the intellectual level of the sports spectator in an armchair: Comment first, read and understand later. After the last Blog comment, I actually went to read the slides of Joanna's presentation. To summarize: I find the slides informative and well-thought-out. I found that the empirical bits appear plausible and well-researched. The stuff following slide 90 was very informative. It is one of the most substantial slide decks I have read in recent times.

Some points to take home though: Whoever writes a rootkit puts himself in a defending position. Defending a position against all known attacks is possible given perfection on the side of the defender. That is bloody hard to achieve. There is no doubt that for any given attack one can think of a counter-attack, but it's a difficult game to play that doesn't allow for errors.

I think the core point that we should clarify is that rootkits should not fall into an adversary's hands to be analyzed. Once they are known, they fall into a defending position. Defending positions are not sustainable long-term, as software has a hard time automatically adapting to new threats.

Once you accept that the key to a good rootkit is to use methods unknown to the victim, one might also be tempted to draw the conclusion that perhaps the virtualisation stuff is too obvious a place to attempt to hide in. But that is certainly open to discussion.

Enough high-level blah blah. I am so looking forward to my vacation, it's not funny.
So it appears the entire Rutkowska-Matasano thing is not over yet. I probably should not harp on about this in my current mood, but since I am missing out on the fun in Vegas, I'll be an armchair athlete and toss some unqualified comments from the sidelines. Just think of me as the grumpy old man with a big gut and a can of beer yelling at some football players on television that they should quit being lazy and run faster.

First point: The blue chicken defense outlined in the linked article is not a valid defense for a rootkit. The purpose of a rootkit is to hide data on the machine from someone looking for it. If a rootkit de-installs itself to hide from timing attacks, the data it used to hide either has to be removed or is no longer hidden. This defeats the purpose of the rootkit: To hide data and provide access to the compromised machine.

Second point: What would happen if a boxer who claims the ability to defeat anyone in the world rejected all challengers unless they paid 250 million for him to fight ? Could he claim victory by telling the press that he "tried out all his opponents' punches, and they don't work, because you can duck them like this and parry them like that" ?
I think not.

I am not saying it's impossible to build a rootkit that goes undetected by Matasano's methods. But given access to the code of a rootkit and sufficient time, it will be possible to build a detection for it. Of course you can then change the rootkit again. And then the other side changes the detection. And this goes on for a few decades.

Could we please move on to more fruitful fields of discussion already ?

Tuesday, July 31, 2007

Some people in the comments of my blog have hinted that I should have just "followed the rules" and nothing would have happened. This is incorrect -- I did follow the rules. It is perfectly legal for an independent contractor to be contracted to perform a task in the US, come in, do it, and leave. That is (amongst other things) what the "business" checkbox on the I94W is for.

What landed me in this trouble is that the immigration agent decided that even though I am CEO of a company in Germany and have no employment contract with Blackhat (just a contract as an independent contractor), the status of "independent contractor" does not apply to me - his interpretation was that I was an "employee" of Blackhat without an H1B visa.

This is not a case of me screwing up my paperwork. This is a case of an immigration agent that did not understand my attempts at explaining that I am not a Blackhat employee, and me not knowing the subtleties of being interviewed by DHS/INS agents.

I hope I will be able to clarify the misunderstanding on Thursday morning at the consulate.
=============================
Small addition to clarify: It is perfectly legitimate to come to the US to hold lectures and trainings of the kind that I am holding at Blackhat. To reiterate: The problem originated solely from a misunderstanding where it was presumed I was an "employee" of a US company, which is not correct.

Sunday, July 29, 2007

Short update: I have managed to schedule a hearing for a regular visa. The first available date was the 24th of August *cough*.

This is clearly too late for Blackhat, but once you have a "regular" meeting scheduled you can ask to have an "urgent" meeting scheduled, too. Whether I am eligible will become clear when the embassy opens at 7am on monday morning.

The current plan is to call them and explain to them why the entire thing might've gone haywire in the first place:

There's a special provision in the German tax code that allows people with certain qualifications to act as special 'freelancers', essentially giving them a status very similar to one-person companies ("Freiberufler"). It is not totally trivial to obtain this status - for example, you cannot simply be a 'Freiberufler' programmer if you write "regular" software.

My agreement with Blackhat and all transactions were taxed in Germany under this status.

Personally, I think the fundamental issue in this tragic comedy is that the US doesn't really have such a special status for freelancers, and that therefore the US customs inspector did not understand that there is a distinction between a "regular Joe" and a "single-person company/Freiberufler". Hence the customs officer assumed that this entire thing must be some devious way to bypass getting an H1B visa for someone who would not normally qualify for one. The frequent repetition of the question "why is your course not given by an American Citizen ?" points to something like that.

I hope that I can clear up this misunderstanding tomorrow morning, but right now, I am not terribly optimistic.
I've been denied entry to the US essentially for carrying my trainings material. Wow.

It appears I can't attend Blackhat this year. I was denied entry to the US for carrying trainings materials for the Blackhat trainings, and intending to hold these trainings as a private citizen instead of as a company.

After a 9-hour flight and a 4 1/2 hour interview I was put onto the next 9-hour flight back to Germany. Future trips to the US will be significantly more complicated as I can no longer go to the US on the visa waiver program.

A little background: For the last 7 years, I have attended / presented at the 'Blackhat Briefings', a security conference in the US. Prior to the conference itself, Blackhat conducts a trainings session, and for the past 6 years, I have given two days of trainings at these events. The largest part of the attendees of the trainings are US-Government related folks, mostly working on US National Security in some form. I have trained people from the DoD, DoE, DHS and most other agencies that come to mind.

Each time I came to the US, I told immigration that I was coming to the US to present at a conference and hold a trainings class. I was never stopped before.

This time, I had printed the materials for the trainings class in Germany and put them into my suitcase. Upon arrival in the US, I passed immigration, but was stopped in customs. My suitcase was searched, and I was asked about the trainings materials.
After answering that these are for the trainings I am conducting, an immigration officer was called, and I was put in an interview room.
For the next 4 1/2 hours I was interviewed about who exactly I am, why I am coming to the US, what the nature of my contract with Blackhat is, and why my trainings class is not performed by an American citizen. After 4 hours, it became clear that a decision had been reached that I was to be denied entry to the US, on the grounds that since I am a private person conducting the trainings for Blackhat, I was essentially a Blackhat employee and would require an H1B visa to perform two days of trainings in the US.

Now, I am a full-time employee (and CEO) of a German company (startup with 5 people, self-financed), and the only reason why the agreement is between Blackhat and me instead of Blackhat and my company is that I founded the company long after I had started training for Blackhat and we never got around to changing it.

Had there been an agreement between my company and Blackhat, then my entry to the US would've been "German-company-sends-guy-to-US-to-perform-services", and everything would've been fine. The real problem is that the agreement was still between me as a person
and Blackhat.

After the situation became clear (around the 4th hour of being interviewed), I offered that the agreement between Blackhat and my company could be set up more or less instantaneously - as a CEO, I can sign an agreement on behalf of my company, and Blackhat would've signed immediately, too.
This would've spared both parties a lot of hassle and paperwork. But apparently, since I had just tried to enter as a 'normal citizen' instead of as an 'employee of a company', I could now not change my application. They would have to put me on the next flight back to Germany.

Ok, I thought, perhaps I will have to fly back to Germany, set up the agreement, and immediately fly back to the States - that would've still allowed me to hold the trainings and attend the conference, at the cost of crossing the Atlantic three times instead of once. But no such luck: Since I have been denied entry under the visa waiver programme, I can now never use this programme again. Instead I need to wait until the American consulate opens, and then apply for a business visa. I have not been able to determine how long this might take -- estimates from customs officials ranged from "4 days" to "more than 6 weeks".

All this seems pretty crazy to me. From the point that 2 days of trainings constitute work that requires an H1B visa, via the issue that everything could've been avoided if I had been allowed to set up the agreement with Blackhat immediately, to the fact that setting up the agreement once I am back in Germany and flying in again is not sufficient, all of it reeks of a bureaucracy creating work for itself, at the expense of (US-)taxpayer money.

I will now begin the Quixotic quest to get a business visa to the US. Sigh. This sucks.

Thursday, July 12, 2007

The Core guys have published a paper on a very cute heap visualisation tool.

What shall I say ? I like it, and we'll play a lot more chess with memory in the future.

Saturday, July 07, 2007

It seems that this country is spinning out of control. We barely have the economy back on track, and now our interior minister is fighting ghosts with flamethrowers:

This link refers to an interview with him where he proclaims that:
  • Germany should create the status of 'enemy combatant' and allow interning 'dangerous elements'
  • The 'targeted killing of suspects' is not in discord with our constitution, but a 'legal problem' that hasn't been 'fully clarified'
I have to admit that while I was critical of the fact that the Bush administration skipped due process and a host of other essential liberties in the Guantanamo/Black Interrogation Sites affair, I was not all too concerned -- after all, after the next election the entire thing would've been rolled back and similar madness made impossible for the next n years. I am quite shocked that our interior minister, in desperate need of some agenda, would like to outdo the Bush administration exactly at a point in time where these policies should be thoroughly discredited.

Time to write a letter to my representative in the German parliament... sigh...

Wednesday, June 13, 2007

MS07-031

We're close to finally releasing SABRE BinDiff v2.0, and I've posted a small movie showing how it can be used to analyze MS07-031 here. Enjoy !

Friday, April 27, 2007

Microsoft seems to consider banning memcpy(). This is an excellent idea - and along with memcpy, malloc() should be banned. While we are at it, the addition and multiplication operators have caused so much grief over the last years, I think it would make total sense to ban them. Oh, and if we ban the memory dereference, I am quite sure we'd be safe.

Banning API calls is not the same as auditing code. Auditing is not supergrep. Sigh.

And "we fuzzed, but it was wrapped in an exception handler" is crazy talk. The debugger gets first notification of any exception, before the exception handler - if you are fuzzing without noting down all the exceptions that occur, you're living in ... uhm ... 2001 ?

But either way: The problem is that people think Vista will be "safe", in absolute terms, which
is false. Vista is "safer", e.g. a number of bugs won't be useful any more. Because of the false perception of Vista being "safe", some people are now disappointed (because of ANI).

Enough ranting. Everybody take a deep breath, relax, and watch the game as OS X gets owned badly for the next two years.

Friday, March 23, 2007

Can someone explain to me why there are so few decent Java decompilers out there ? Yes, JAD does a decent job in many cases, but sometimes simple control flow confuses it and the reconstruction is less than accurate. JODE is sometimes better in that regard, but fails on a good number of files, and also does not seem to assign new variable names based on the types of the variables.

With all that Java code on my cellphone, it's slightly annoying that it's so difficult to get a decent decompile. I mean, once I have that I can work in eclipse and refactor the class/variable names until I am happy.

Then again, it seems Java decompilers were all the rage in 1997-2002, and nowadays few people seem to be developing them...

Wednesday, February 21, 2007

I will be at Blackhat Federal in Washington DC next week, and since I am not giving a talk, I will have some free time to chat :-)

If anybody in the Washington DC area would like to meet and / or have our products demo'ed, please drop me a mail at halvar.flakeXnospamX@sabre-security.com.

Cheers,
Halvar

Monday, February 05, 2007

I would like to use this blog to make the MD5Sum and the SHA1sum of a certain file public:

MD5Sum:
5e5ed3b92b2abbcc1adaa18cc0ca6aaf

SHA1sum:
FFECBE21E3EC93A5AC2B94889AD2967881398A9C
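(Once the file itself becomes public, checking it against these values is a one-liner -- e.g. with Python's hashlib; the filename below is obviously a placeholder:)

import hashlib

data = open("the_file_in_question", "rb").read()  # placeholder filename
print(hashlib.md5(data).hexdigest())   # should match the MD5 above
print(hashlib.sha1(data).hexdigest())  # should match the SHA1 above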

Cheers,
Halvar

Thursday, January 18, 2007

One of the most amusing new features of BinNavi in the v1.2 release is the GDB agent. FX (of SABRE Labs fame) worked hard to create a proxy that sits between the BinNavi GUI and anything speaking the GDB serial protocol, either via a serial line or via TCP.

Now, what is this good for ?

First of all, it allows one to use BinNavi's debugging capabilities on platforms that we do not explicitly support (if a recent GDB version works on it). This means most *NIX variants. Let's say, for some reason, you have a FreeBSD system on which you'd like to debug some piece of software, and BinNavi does not come with a FreeBSD debugger. But GDB runs on FreeBSD - so you just run your target under gdbserver and use the BinNavi GDB agent via TCP to transparently debug the target.
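(The gdbserver side of that needs nothing special -- something along the lines of "gdbserver :2345 ./yourtarget" on the FreeBSD box, with port and target being whatever you use, and then you point the GDB agent at that host and port.)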

Now, using BinNavi on more-or-less arbitrary *NIX systems is nice, but the real joy lies elsewhere: FX made sure that the debugging proxy does not only speak the GDB protocol as spoken by GDB itself, but also the variants spoken by Cisco IOS and ScreenOS.

This makes reverse engineering embedded systems that speak either regular GDB protocol or one of the supported variants a blast: In the past, we had to proceed as follows:
  1. Get a ROM image from somewhere
  2. Stare at the image to figure out methods to decompress it properly
  3. Once this was achieved, load the image into IDA and use switch()-constructs to determine the proper loading address of the image
  4. Load the image into IDA again, this time at the correct address
Of course, live-debugging was usually out of the question.
With the BinNavi GDB Agent, we can now do the following:
  1. Attach the device to a serial port and set it into GDB mode
  2. Read & dump the memory from the current instruction pointer backwards until the device freezes
  3. Read & dump the memory forwards from the current instruction pointer until the device freezes
  4. Load the result into IDA and export the disassembly into BinNavi
  5. Do live-debugging on the device in question :-)
So, as an exercise, we took a Netscreen-VPN5 we had acquired via Ebay. Unfortunately, it did not come with a support contract, so we could not get software images to disassemble. So we set the device into GDB mode by typing "set gdb enable" in the console, and connected:

C:\BinNavi.v1.2\gdbagent>gdbcmd COM1,9600 NS5XT
Connected via \\.\COM1 (baud=9600 parity=N data=8 stop=1) to Netscreen 5XT Agent
/ PowerPC

[q] quit | [r] Registers | [c] Continue | [R] Reset | [b] Breakpoint
[s] step | [m] Read Memory | [D] Detach | [d] Dump Memory Range


Reading Registers ... done

GPR0 = 1
GPR1 = 350f958
GPR2 = aecce8
GPR3 = ffffffffffffffff
GPR4 = 2e
GPR5 = 0
GPR6 = 0
GPR7 = 0
GPR8 = d55e70
GPR9 = ae0000
GPR10 = d50000
GPR11 = d50000
GPR12 = 40000024
GPR13 = 0
GPR14 = 0
GPR15 = 0
GPR16 = 0
GPR17 = 40140130
GPR18 = 0
GPR19 = 186ac40
GPR20 = 0
GPR21 = 350ff78
GPR22 = 186ac4e
GPR23 = ffffffffffffffff
GPR24 = 0
GPR25 = 0
GPR26 = 0
GPR27 = 0
GPR28 = 186ac40
GPR29 = 0
GPR30 = 186a910
GPR31 = ae5684
(...)
PC = 6826c
MSR = 29230
CR = 40000028
LR = 67c10
CTR = 249b30
XER = 20000002


The program counter is set to 0x6826c, and thus we know: Some code is mapped at 0x6826c. It is a pretty safe bet that all code will be consecutive in memory, so we will now dump the memory forwards and backwards from this address: We type "d" in the command line and enter the base address and the number of bytes (in hex) we want to dump:

Memory at: 68000
Size: 400000
Filename: 0x68000.0x400000.dmp


The agent now begins to read the memory off the device in chunks of 1024 bytes via the 9600 baud serial port - so it is a good idea to go to lunch in the meantime. Once we're back from lunch, we reboot the NS5XT - it will have hung when it ran out of memory to dump. We set it back into debugging mode and dump the memory before offset 0x68000:

Memory at: 40000
Size: 28000

Filename: 0x40000.0x28000.dmp
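For the curious: what goes over the wire here is just the plain GDB remote serial protocol -- a memory read is an ASCII packet "m<addr>,<length>" framed by "$"..."#" plus a one-byte checksum. A rough sketch of the framing (assuming pyserial; this is not the agent's actual code, and it omits the "+" acknowledgements):

import serial  # pyserial

def rsp_packet(payload):
    csum = sum(payload.encode()) % 256           # modulo-256 sum of the payload bytes
    return b"$" + payload.encode() + b"#" + ("%02x" % csum).encode()

port = serial.Serial("COM1", 9600, timeout=5)
port.write(rsp_packet("m68000,400"))             # read 0x400 bytes at 0x68000
reply = port.read(4096)                          # hex-encoded memory, or an error code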

We stitch the two files together end-to-end, load them into IDA and run a few small scripts to identify function entry points and do some minor fixing of the disassembly (principally switch statements, and some function naming), and export everything into the BinNavi database. We then open it as usual in BinNavi, open the callgraph and start browsing around.
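The stitching itself is trivial -- a minimal sketch (the output filename is arbitrary):

# Glue the low dump (0x40000..0x68000) and the high dump (0x68000 onwards)
# into one contiguous image that IDA can load at base 0x40000.
with open("0x40000.0x28000.dmp", "rb") as lo, \
     open("0x68000.0x400000.dmp", "rb") as hi, \
     open("ns5xt_flat.bin", "wb") as out:
    out.write(lo.read())
    out.write(hi.read())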

On the left, we see a callgraph view of the device's IKE packet handlers (which we inferred from string references in the disassembly), plus the functions that are directly called by them.

Now, which of these functions would be executed when we run a round of ike-scan against the device ?

Clicking on the red button makes BinNavi talk to the BinNavi GDB agent to set one-time breakpoints on all functions in the graph on the left - due to the serial link, this is not blazingly fast, but after seconds, not minutes, we have breakpoints on all these functions. We then run ike-scan against the device, and click on "stop recording" again. The result is the list of functions from our graph that were executed - highlighted in the following pictures:

Clearly we can do the same on the function flowgraph level in, for example, the function labeled IKE_SA_Handler above. Generally, everything you can do with BinNavi on Win32 executables you can also do with BinNavi on the embedded device now: Record traces, set breakpoints, set Python callbacks on breakpoints, read memory, read registers etc. etc...

The following three screenshots show the function in question being debugged. The first screen shows the path that is executed on running an ike-scan against the device highlighted in red. The second screen shows BinNavi having suspended the execution on the basic block with the red/blue border (the blue border indicates a persistent breakpoint on the basic block, the red border indicates that execution is currently suspended on that block). The third screen just shows the registers and some memory of the device at this point in time.

So to sum things up: With the BinNavi GDB Agent, you can debug anything that speaks the GDB protocol more or less just as if it were a regular windows app (small caveat: You are speaking with most embedded devices via a serial port, oftentimes 9600 baud. You probably do not want to set 60,000 breakpoints at once - aside from the bandwidth consumption, it is common for the gdb server to handle only a limited number of breakpoints. In our tests, setting several hundred was no problem). Extracting ROM images in a format that is easily disassembled is easy, and full on-device debugging helps a lot with all our favourite tasks:
  • understanding the code at hand
  • identifying which functions are responsible for which features
  • hunting for security vulnerabilities
  • constructing input to reach vulnerable locations
Have a good week, I have some more reversing to do :)

Oh, and be sure to check out Ero Carrera's Blog - he will post about the SQL database format used by BinNavi at the end of next week, and show why it's useful and flexible.

Thursday, November 23, 2006

Over at the Matasano Blog :)

Matasano's blog quoted my post on Office bugs, and Ivan Arce made some excellent points in the comments:
1. "They are inherently one-shot. You send a bad file, and while the user might try to open it multiple times, there is no way the attacker can try different values for anything in order to get control."

IA: OK. good point but…think about scale & diversity. Even in a targeted attack sending a one-shot client-side exploit against N desktop systems will with one hardcoded address will offset the value of ALSR with some probability of success for a given N. The attacker only needs ONE exploit instance to work in order to break into ONE desktop system, after that it is game over. Client-side bugs are one shot against the same system but not necesarrilly so against several systems in parallel.

Very true, I did overlook this. It also explains the use of really low-value phone-home bots as payload: If you're going to attack in such a "wide" manner, you essentially accept detection as long as you can compromise one of the relevant clients. This means that whatever you are sending will be lost, and therefore you won't send anything more sophisticated than a simple bot.
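To put a number on Ivan's point: if a single hardcoded address is right with probability p on any one machine, firing the same one-shot exploit at N machines succeeds at least once with probability 1-(1-p)^N. With, say, p = 1/256 (one guessable byte of randomization) and N = 1000 targets, that is already around 98% -- scale buys back a lot of what ASLR takes away.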

2. "There can not be much pre-attack reconnaissance. Fingerprinting server versions is usually not terribly difficult (if time consuming), and usually one can narrow down the exact version (and most of the times the patch level) of a target before actually shooting valuable 0day down the wire. With client side bugs, it is a lot more difficult to know the exact version of a piece of software running on the other side - one probably has to get access to at least one document created by the target to get any data at all, and even this will usually be a rough guesstimate."

IA: Hmmm not sure about this either. I would argue the desktop systems (clients) leak A LOT more information about themselves than servers and, generally, those leaks are much less controlled and/or controllable and easier to elicit than server leaks. After all, as a general principle, client apps are _designed_ to provide information about themselves.

Not to mention that a lot of information about your desktop systems has *already* leaked and is publicly available on the net now (server logs, emails, documents, stray packets, etc.), you just need to know how and where to look for it.

I disagree on this to an extent. My system leaks information about my mail client because I participate in public forums etc., but the majority of corporate users never gain any visibility outside of the internal network. Most people just don't use mailing lists or usenet etc. So it will be comparatively easy to attack some security officer (hey, I know his exact client version), but the CEO's secretary (who might be a lot more interesting as a target, and less likely to notice her computer is compromised) will be more or less "invisible".


Tuesday, November 21, 2006

Unbelievable but true

I am decompressing a bit after a few weeks of insane stress and thus I am actually reading blogs. And to my greatest surprise, I ended up reading this one. Now, Oracle security has never interested me ever since I tried to audit it in 2000 and it kept falling over without a fight (or without us really doing anything except sending a few letters to it), but I have to admit that Ms. Davidson's blog has a pretty high entertainment value (at least for me, a morally degenerate piece of eurotrash full of the afterglow of a once good education system), AND it is refreshing to see someone with a bit of a classical education in IT security (I get picked on regularly for the fact that I got my Latinum "on the cheap" and know jack shit about ancient Greek - then again, my circle of friends includes a mathematician who claims that he can, by means of listening to a record, tell you in which church in France a certain piece of organ music was played, and hence I am always the loud and stupid one).

Anyhow, given Oracle's horrible code quality, I am very much positively surprised at the quality of Ms. Davidson's blog. And given what most people that have worked with static analysis tools before would describe as a horrible mistake in evaluating tool quality, I would like to mention that mathematics and geometry are part of a classical education. Whoever decided on the right source code analysis tool to use for detecting flaws in Oracle apparently failed that part.
Client Side Exploits, a lot of Office bugs and Vista

I have ranted before about careless use of 0day by seemingly Chinese attackers, and I think I have finally understood why someone would use good and nice bugs in such a careless manner:

The bugs are going to expire soon. Or to continue using Dave Aitel's and my terminology: The fish are starting to smell.

ASLR is entering the mainstream with Vista, and while it won't stop any moderately-skilled-but-determined attacker from compromising a server, it will make client side exploits of MSOffice file format parsing bugs a lot harder.

Client-side bugs suffer from a range of difficulties:
  1. They are inherently one-shot. You send a bad file, and while the user might try to open it multiple times, there is no way the attacker can try different values for anything in order to get control.
  2. There can not be much pre-attack reconnaissance. Fingerprinting server versions is usually not terribly difficult (if time consuming), and usually one can narrow down the exact version (and most of the times the patch level) of a target before actually shooting valuable 0day down the wire. With client side bugs, it is a lot more difficult to know the exact version of a piece of software running on the other side - one probably has to get access to at least one document created by the target to get any data at all, and even this will usually be a rough guesstimate.
As a result of this, client-side bugs in MSOffice are approaching their expiration date. Not quickly, as most customers will not switch to Vista immediately, but they are showing the first brown spots, and will at some point start to smell.

So you're in a situation where you're sitting on heaps of 0day in MSOffice, which, contrary to Vista, was not the biggest (private sector) pentest ever (This sentence contains two inside jokes, and I hope that those who understand them aren't mad at me :-). What do you do with those that are going to be useless under ASLR ? Well, damn, just fire them somewhere, with some really silly phone-home-bots inside. If they bring back information, fine, if not, you have not actually lost much. The phone-home bots are cheap to develop (in contrast to a decent rootkit) and look amateurish enough as to not provoke your ambassador being yelled at.

If you are really lucky, you might actually get your opponent to devote time and resources to countermeasures against MS Office bugs, in the hope they don't realize that work will be taken care of elsewhere. In the meantime, you hone your skills in defeating ASLR through out-of-defined-memory-read-bugs (see some blog post in the next few days).

On a side note, I am terribly happy today. I've had more luck this week than I deserve.

Monday, November 20, 2006

While we're all talking about the next overflow and think that they have significance in the wider scheme of things, I'll climb on the soapbox for 5 minutes:

We should send peacekeeping troops to Darfur/Sudan. I was strongly opposed to the Iraq war (on the grounds that an invasion would bring civil war), but I plead with my government: Take my taxes and send peacekeeping forces to Sudan. _If_ we have decided that the 'europeans-are-from-venus'-stance is obsolete, we have here a primary example of a conflict where external invasion appears necessary according to almost everybody (except the government in Khartoum).

Thursday, October 05, 2006

While I am blogging about strange hobbies: I used to draw a lot, and still appreciate a few comics. Most importantly, local cult hero Jamiri.

Some examples:
http://www.spiegel.de/netzwelt/netzkultur/0,1518,grossbild-650193-422928,00.html

http://www.spiegel.de/netzwelt/netzkultur/0,1518,grossbild-669475-427889,00.html
I am known for odd hobbies and interests, and for a long while, I have been very fascinated with all forms of syncretism, specifically Caribbean syncretism.

For various private reasons I am exposed to quite a bit of information about social anthropology, and I usually find the descriptions of odd rites in various societies very amusing and enlightening.

For example, any diagram of multi-family cross-cousin-marriage in some African societies just brings out the graph theory nerd in me, and serious scientific texts debating the difference between endo- and exocannibalism (eat your own tribe vs. eat the other tribe) are a fun diversion from reading dry stuff all day.

Yet I was unprepared for reading about the "Cargo Cult" today. And thinking about it, the sheer fact that a cargo cult developed in Melanesia makes me want to laugh and cry at the same time.

Read it. It's worth it.

Friday, September 08, 2006

Matasano refers to Bleichenbacher's recently published attack. Tremendously short comment:

Anything that does RSA with low exponent is likely attackable. And padding should always be OAEP. ;)
After all the Brouhaha surrounding the work on Apple wireless drivers, I'd like to pitch my two cents:
  • Who cares whether this is real or not ? The possibility of breaking NIC drivers (especially in multithreaded kernels) is real, and nobody should be surprised if this happens. Has anyone ever disassembled the pos drivers that come with every cheap electronic USB gadget ? I have my doubts that the QA for NIC drivers is a lot better.
  • It seems we are not the only ones with a similar problem: http://eprint.iacr.org/2006/303.ps
In the above paper, Eric Filiol says he has broken E0, but does not give any description of the analysis - just a (significant) number of keys that lead to very long strings of zeros or to keystreams with a predefined Hamming weight.

I am not decided on the paper yet - I read it yesterday evening, jetlagged, over half a bottle of wine. This sort of publishing would be very easy for hash functions -- I would believe anyone claiming he can build second pre-images (or even pre-images) for MD5 if he can give me a string of input that hashes to "thequickbrownfox....".

Now, we just need stuff like that for bugs ;-)

Monday, August 21, 2006

Now with all this noise surrounding the ConsumerReports article where they created 5500 new virus variants, I would really like to get my hands on their sample list to see how VxClass, our malware classification engine, deals with them.

Friday, August 11, 2006

Just to clarify: PaiMei is really good, the previous post was not supposed to be negative or detrimental -- it's definitely cool stuff.
From Matasano:

"The results of one trace can be used to filter subsequent traces. This is huge (in fairness: it’s something that other people, notably Halvar [I believe], have been working on)."

I have to admit that our flash movies that we posted last year in September are mind-numbingly boring, but they do show this sort of stuff ;) -- BinNavi has been able to record commentable debug traces since day 1.

http://www.sabre-security.com/products/BinNavi/flash_binnavi_debugger.html
http://www.sabre-security.com/products/BinNavi/flash.html

The entire idea of breakpointing on everything and doing differential debugging dates back to at least a Blackhat presentation in Vegas 2002. Fun stuff, and good to see that with PaiMei there is finally a free framework to do this.

I really need to re-do the BinNavi movies in the next weeks, they really do not do our product any justice any more.

To continue shamelessly plugging my product :-):

"Can I have stack traces for each hit? I know they’re somewhat redundant, but I can graph them to visualize control flow (in particular, to identify event and “parse” loops)."

You will be able to in the next release (scheduled for October), which lets you attach arbitrary Python scripts to breakpoints and thus do anything to memory you want.
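Just to give a flavour of what such a hook might look like (the names below are invented for this sketch, not the actual BinNavi API):

# Purely illustrative pseudo-API -- attach this to a breakpoint and collect a
# crude stack snapshot on every hit.
def on_breakpoint(thread):
    esp = thread.get_register("ESP")
    # read a few stack slots; whatever looks like a return address can be
    # filtered and graphed later
    snapshot = [thread.read_dword(esp + 4 * i) for i in range(32)]
    record_hit(thread.get_register("EIP"), snapshot)  # record_hit: your own bookkeeping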

"Symbols. Pedram acknowledges this in his presentation. It didn’t slow me down much not to have them, but it feels weird."

If IDA has them, BinNavi has them.

"I need to be able to click on a hit and see the assembly for it (if there’s a way to click on something and have it pop up in IDA, so much the better)."

Right-click->open subfunction in BinNavi ;)

"Yeah, I need this for non-Windows targets. Remote debugging is apparently coming, which will help. I don’t imagine Pedram’s working on SPARC support (X86 and Win32 has eaten its way pretty thoroughly through the code). Also,"

We have Linux/ptrace support and (very experimental) WinCE/ARM support.

I promise to redo the movies in the next few weeks.

Enough of the advertisement crap.

Cheers,
Halvar

Wednesday, July 26, 2006

The security world never ceases to amaze me. A few years ago, a few friends of mine would run around security conferences and drunkenly yell "fuzz tester! fuzz tester!" at people who, well, fuzzed. I found this really hilarious.

What I find amazing though is that fuzzers are now being seriously discussed in whitepapers and even called "artificial intelligence". Folks, can we please NOT do the time warp again? And can we please start writing about something new?

On a side note: Since I am a bit of a language nerd, I can't fail to notice that "artificial intelligence" takes a semantically cool twist when mentioned in the same sentence as "yellowcake from Africa".

PS: This post is a rant about people that write about fuzzing as a new threat, not about people that write and use fuzzers. Just to clarify :)
I will have an 8-hour layover in Toronto tomorrow -- anyone up for a coffee ?

Tuesday, July 11, 2006

The article at this link is a bit funny, but if it is true that Materazzi made racial slurs against Zidane, then his headbutt was the ONLY proper answer to that.

Racism on the pitch should not be tolerated under any circumstances, and a healthy team would not tolerate racist remarks from any team member.

If Zidane's reaction was a response to racist remarks, then his headbutt is a symbol for a world cup that did not tolerate racism, and that united people from all over the world instead of dividing them.

On a side note, I am very happy for all the Italians :-) and I'd like to thank my Italian neighbours for having invited us to their place to watch the final.

Enough football, now back to work.

Monday, July 10, 2006

I know that I am going to draw the hate of many people for this post, but I refuse to think less of Zidane for the headbutt against Materazzi. As strange as it sounds, for some reason I am quite convinced that he must have had a good reason for this.

Nobody is mad enough to just headbutt an opponent in the worldcup finals in the last game of a legendary career unless he has a very good reason.

But well.

Tuesday, July 04, 2006

Question for the Blogosphere: Does anyone know of a real-life crypto protocol in which Diffie-Hellman over a finite field is used, and that finite field is NOT a prime field? To be exact, I am looking for examples of real-life crypto using Diffie-Hellman over GF(p^m) where m > 1.

Sunday, July 02, 2006

This eBay posting for a yacht that was previously owned by China's Minister of Defense might in fact be a bargain -- I would assume one automatically buys not only the yacht but also some state-of-the-art (as of the mid-'90s) electronics. I am not sure if that is still worth 2m USD, but still.

Saturday, July 01, 2006

I used to read security blogs via http://www.dayioglu.net/planet/ , which now seems down.
It's amusing how quickly I have quit reading blogs since. Funny world.

Saturday, June 24, 2006

On bug disclosure and contact with vendors

After reading HDM's blog entry on interaction with MS on one of the recent bugs, I guess I should drop my 2c's worth of opinion into the bowl regarding bug disclosure:

So sometimes I get the urge to find bugs. Then I go out and sometimes I find bugs. Then I usually feel quite happy and sometimes I even write an exploit. I do all this out of personal enjoyment -- I like bugs. I like having to play carom billiards to get an exploit to work (meaning having to bounce things off of each other at weird angles to get stuff to work). Now, of course, once I am done I have several options on what to do with a bug.
  1. Report it to the vendor. This would imply the following steps, all of which take up time and effort better spent on doing something interesting:
    1. Send mail to their secure@ address, requesting an encryption key. I think it is amusing that some vendors like to call security researchers irresponsible when the default channel for reporting vulnerabilities is unencrypted. That is about as irresponsible as the researchers talking about vulnerabilities on EFNET.
    2. Get the encryption key. Spend time writing a description. Send the description, possibly with a PoC.
    3. MSRC is a quite skilled bunch, but with almost any other software vendor a huge back-and-forth begins, in which one has to spend time explaining things to the other side. This involves writing boring explanations of boring concepts, etc.
  2. Sell it to somebody who pays for vulnerabilities. While this will imply the same lengthy process as mentioned above, at least one can in theory get paid for it. Personally, I wouldn't sell bugs, but that could have several reasons:
    1. I am old and lame and can't find bugs that are good enough any more
    2. The few bugs that I find are too close to my heart to sell -- each good bug and each good exploit has a story, and I am not so broke that I'd need to sell something that I consider inherently beautiful
    3. I don't know the people buying these things. I don't know what they'd do with it. I wouldn't give my dog to a total stranger either.
  3. Keep it. Perhaps on a shelf, or in a frame. This implies zero effort on my side. It also gives me the joy of being able to look at it on my wall and think fondly of the story that it belonged to.
So in case of 1), after having spent weeks on a bug, I have to spend more time doing something unenjoyable, and get a warm handshake with the words 'thanks for helping secure (the internet/the world/our revenue stream)'.
In case 2), I get a warm handshake, some money, and a feeling of guilt for having given my dog to a total stranger.
In case 3), I have something to look at with fond memories and have to invest no time at all into things that I don't find interesting.

What would be your choice ?

Friday, June 23, 2006

I really enjoyed reading Ilfak's blog post today :-) -- it always makes me happy to see clever abstractions and the results they produce. And I really enjoy original ideas (of which there seems to be a very finite amount in IT :)

Monday, June 12, 2006

Compression, Statistics and such

In the process of doing the usual stuff that I do when I am not struggling with my studies, I ran into the problem of having a number of streams with a very even distribution of byte values. I know that these bytes are executable code somehow encoded. I have a lot of reason to suspect that they are compressed, not encrypted, but I have not been able to make sense of it yet.

This brought me to the natural question: Do common compression algorithms have statistical fingerprints that would allow them to be distinguished from one another, more or less irrespective of the underlying data? It is clear that this gets harder as the amount of redundancy decreases.
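To make concrete what I mean by statistical fingerprints: the obvious first-order statistics to compute over such a stream are a byte histogram, its entropy, and a chi-square statistic against the uniform distribution. A quick sketch (illustration only, nothing sophisticated):

/* Byte histogram, Shannon entropy (bits/byte) and chi-square vs. uniform. */
#include <stdio.h>
#include <math.h>

int main(int argc, char **argv) {
    if (argc != 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }
    FILE *f = fopen(argv[1], "rb");
    if (!f) { perror("fopen"); return 1; }

    unsigned long count[256] = { 0 }, total = 0;
    int c;
    while ((c = fgetc(f)) != EOF) { count[c]++; total++; }
    fclose(f);
    if (total == 0) return 1;

    double entropy = 0.0, chi2 = 0.0, expected = total / 256.0;
    for (int i = 0; i < 256; i++) {
        if (count[i]) {
            double p = (double)count[i] / total;
            entropy -= p * log2(p);
        }
        double d = count[i] - expected;
        chi2 += d * d / expected;
    }
    printf("bytes: %lu  entropy: %.4f bits/byte  chi^2 (255 dof): %.2f\n",
           total, entropy, chi2);
    return 0;
}

Properly encrypted data should come out very close to 8 bits per byte with an unremarkable chi-square, whereas compressors tend to leave more algorithm-specific traces (headers, block boundaries, slight skews in the distribution).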

It was surprising (at least to me) that nobody else has publicly worked on this yet.

Also, it made me regret that, due to time constraints involving some more algebraic courses, I was unable to attend the Statistics I and II lectures given at my university by Prof. Dette. Had I attended, I would have a better idea of how to make use of the capabilities that software like R could give me.

Another example of the fundamental law of mathematics: for every n denoting the number of days you have studied mathematics, there exists a practical problem that makes you wish you had already studied 2n days.

Monday, June 05, 2006

Some shameless self-promotion: Rolf and I are going to teach a special one-day class on BinDiff 2 at BlackHat Las Vegas this year:

http://www.blackhat.com/html/bh-usa-06/train-bh-us-06-hf-sabre.html

We'll cover applications of BinDiff to malware analysis, detecting code theft and GPL violations, and of course the usual patch analysis.

Sunday, May 28, 2006

My prediction for the next two years: Apple, Symantec, McAfee, Oracle etc. will get pounded into the ground by lots of bugs being found and disclosed by security researchers who are looking for easier targets than the current MS codebase. And the above-mentioned companies won't have monopoly revenue to throw around to fix the issues.

This is a big opportunity for MS to move into all their markets :-) and sell their products as superior on the security side.

While I am in an "evil" mood: The German train system is about to be IPO'ed, and there's a lot of debate going on here about details of the contract. What is most interesting but not being debated:
All real estate owned by Deutsche Bahn AG (the privatized version of the German train system that is going to be floated) is on the books at its value upon acquisition -- meaning its value in 1935. The real estate in the possession of DB is, by today's value, worth several times more than the total money they expect to get out of the IPO.

If I were an investment banker, I'd gang up with a bunch of private equity folks, buy the majority in DB AG once it is IPO'd, and then sell off the real estate. Other countries (the USA, Britain) survive without a decent train system, too, and I wouldn't care as I'd have a Rolls and a driver.

Alright, enough of the devil's advocate mode. It was fun seeing my brother last weekend, and we always come up with good ideas ;)

Tuesday, May 23, 2006

MSASN1 is hard to read these days -- the code makes heavy use of carry-flag-dependent arithmetic (adc, rcl, etc.) to check for integer overflows.
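For reference, the C-level check that such a carry-flag sequence boils down to looks roughly like this (illustration only, not decompiled MSASN1 code): for unsigned values, "the sum is smaller than one of the operands" is exactly "the addition set the carry flag".

/* Sketch: detect wraparound of an unsigned addition via the carry condition. */
#include <stdio.h>

/* Returns 1 if a + b wrapped around; otherwise stores the sum in *sum. */
static int add_carries(unsigned int a, unsigned int b, unsigned int *sum) {
    unsigned int s = a + b;   /* well-defined: unsigned arithmetic wraps mod 2^N */
    if (s < a)                /* true iff the hardware carry flag was set */
        return 1;
    *sum = s;
    return 0;
}

int main(void) {
    unsigned int s = 0;
    printf("%d\n", add_carries(0xfffffff0u, 0x20u, &s)); /* 1: wrapped       */
    printf("%d\n", add_carries(0x10u, 0x20u, &s));       /* 0: s is now 0x30 */
    return 0;
}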

Saturday, May 20, 2006

The folks from Vodafone dropped by today and brought us some mobile viruses to play with -- thanks! :-)

So cross-platform diffing can be fun -- Rolf ran a diff of Commwarrior.B against Commwarrior.C today, and while B is compiled for standard ARM, C is compiled in 'thumb mode', which is pretty much the same as being compiled for a different CPU (Thumb is a different instruction encoding, so essentially all the instructions differ).

The amusing result is that even though the compilation is for a different platform, we still get roughly 61% of the functions matched. And the functions, which are clearly the same on the 'structural' (i.e. flowgraph) level, have completely different instructions, and manual inspection confirms that these differing instructions end up doing the same thing.
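A crude way to picture what "the same on the structural level" means -- this is NOT the actual BinDiff algorithm, just an illustration with made-up numbers: reduce each function to a tuple of flowgraph features that do not depend on the instruction encoding, then match functions whose tuples agree across the two binaries.

/* Illustration: matching functions on encoding-independent flowgraph features. */
#include <stdio.h>

struct func_sig {
    const char *name;        /* for display only */
    int basic_blocks;
    int edges;
    int calls;
};

static int same_structure(const struct func_sig *a, const struct func_sig *b) {
    return a->basic_blocks == b->basic_blocks &&
           a->edges == b->edges &&
           a->calls == b->calls;
}

int main(void) {
    /* Made-up signatures for one function compiled as ARM and as Thumb:
       the bytes differ completely, the flowgraph features do not. */
    struct func_sig arm_version   = { "sub_4010", 12, 17, 3 };
    struct func_sig thumb_version = { "sub_80c4", 12, 17, 3 };

    printf("structural match: %s\n",
           same_structure(&arm_version, &thumb_version) ? "yes" : "no");
    return 0;
}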

For those of you that want to verify things manually, click here.
Quote from Lock, Stock and Two Smoking Barrels: "I don't care who you use as long as they are not complete muppets".

Having MSOffice 0day is not terribly hard, but one should not burn it by making it drop standard, off-the-shelf, poorly-written bot software. The stealth advantage that one has by sending .DOC files into an organisation should not be given up by creating empty SYS files or dropping DLLs.
Also, adding registry keys to regain control on reboot is kinda suboptimal.

I am kinda curious to know how they got caught, but my guess is that the bad QA on the Internet Explorer injection caused enough crashes to make people investigate.

On a side note, this highlights a few common problems people face when doing client side attacks:
  • One-shot-ness -- any exploit you write is a one-shot and should work reliably
  • Process recovery -- any exploit you write needs to be able to recover and have the exploited application resume as if nothing happened. This is a tad hard if you've written 200 megs of garbage to the heap.
  • Lack of complete pre-attack intel on the target environment -- I don't know what went wrong when they injected into iexplore, but they must've been confident that their code was good enough. This means they tested it on a testbed which didn't reflect the actual target.
  • Lack of attack focus -- I am quite convinced that they could've had a simpler, stealthier, and more stable bot component if they had thought more thoroughly about what their goal in this attack was
Enough ranting for today.

Friday, May 19, 2006

For those that are into malware classification, here's some code that one
can include in a piece of malware to skew the Levenshtein distance described
in the recently published MS paper.

/* Fragment (needs <stdio.h> and <stdlib.h>): open and close a file a random
   number of times to flood the behavioural trace with bogus file-access events. */
int j, i = rand() % 50000;    /* any random iteration count will do */
FILE *f;
for (j = 0; j < i; j++) {
    f = fopen("c:\\test.txt", "rt");
    if (f)                    /* fopen may fail; only close a valid handle */
        fclose(f);
}

Tuesday, May 16, 2006

Behavioural classification of malware

Today is a good day: I got my math homework done before it had to be handed in, and that leaves me some free time to blog :-)

Dana Epp has a post referring to an entry by Tony Lee referencing an EICAR paper on automated malware classification using behavioural analysis. I am not totally unbiased on this as we at SABRE have been working a lot on structural classification of malware recently, so take my following criticism with a grain of salt.

I personally think the approach in the paper is suboptimal for the following reasons:
  1. By using behavioural data, we can only classify an executable based on things it does in the observed timeframe. Any time-delayed trigger (one that, e.g., fires two months from now) is hard to see, and the application might just sleep until then. How do we classify something that just came into our networks? We can't classify it until it starts becoming "active".
  2. It is trivial, even for somebody who knows only rudimentary programming, to modify a program so that the modified program has only a few (~4?) lines of code more than the original program, yet its Levenshtein distance as measured in the paper is arbitrarily large. As it stands, adding file writes in a loop should be enough, and the Levenshtein distance can be arbitrarily increased by more loop iterations (see the small sketch after this list).
  3. The paper cites on-access deobfuscation as a principal problem that static analysis cannot easily deal with -- but from (2) it follows that on-access deobfuscation can be coupled with Levenshtein-distance-maximizing code in a trivial manner, breaking the approach that was proposed as superior in the paper. The claim that runtime analysis effectively bypasses the need to deal with obfuscation is simply not true if the application ever targets the event collection by 'junking' it with bogus events.
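To put a number on point (2): here is a toy Levenshtein implementation over event strings (one character per observed behavioural event; the encoding is made up). Appending bogus events -- e.g. via the file-open loop from the snippet above -- to an otherwise identical trace increases the distance without bound.

/* Toy Levenshtein distance over behavioural event strings (illustration only). */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static size_t levenshtein(const char *a, const char *b) {
    size_t la = strlen(a), lb = strlen(b);
    size_t *prev = malloc((lb + 1) * sizeof(size_t));
    size_t *cur  = malloc((lb + 1) * sizeof(size_t));
    if (!prev || !cur) abort();
    for (size_t j = 0; j <= lb; j++) prev[j] = j;
    for (size_t i = 1; i <= la; i++) {
        cur[0] = i;
        for (size_t j = 1; j <= lb; j++) {
            size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            size_t del = prev[j] + 1, ins = cur[j - 1] + 1, sub = prev[j - 1] + cost;
            cur[j] = del < ins ? del : ins;
            if (sub < cur[j]) cur[j] = sub;
        }
        memcpy(prev, cur, (lb + 1) * sizeof(size_t));
    }
    size_t d = prev[lb];
    free(prev); free(cur);
    return d;
}

int main(void) {
    /* 'o' = open file, 'r' = write registry key, 'c' = connect, ... (made up) */
    printf("%zu\n", levenshtein("orc", "orc"));            /* identical traces: 0 */
    printf("%zu\n", levenshtein("orc", "orcoooooooooo"));  /* grows with each bogus 'o' */
    return 0;
}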
Taken together, this means that the approach presented in the paper can be trivially foiled with very minor high-level-language modifications to the source of the program, whereas the static structural comparisons we use have to be foiled via a strong obfuscation component -- which, if done moderately cleverly, would also foil the approach from the paper.

I'm not saying the paper isn't good or doesn't touch valid points, but behaviour is so trivially randomized, even at the high-level-language level, that the approach in the paper is next to pointless once malware authors target it.

On to something kinda different:

A more general question we have to ask ourselves is: do we really want to measure the accuracy of new, automated malware classification algorithms by comparing them to the results of the manual classification done by AV vendors so far, which had neither efficient information sharing nor any clear methodology for naming malware? Any sort of machine learning based on the AV-industry-provided datasets needs to be very resilient to partially incorrect input data, as a good number of bots seem to be more or less arbitrarily named.

Anyhow, time to go to sleep and read Courtois' eprint paper.

Friday, May 12, 2006

Microsoft's built-in firewall has some really annoying quirks. I am running a laptop connected to an untrusted network and an instance of VMWare connected on a different interface. If I disable the firewall on the VMWare interface, it automatically gets disabled on the global interface. Very cute. Can we get this fixed?

Monday, May 08, 2006

Important German saying:
"Wer keine Probleme hat macht sich welche"

"Those that do not have any problems will create some for themselves"

Saturday, April 29, 2006

More on automated malware classification and naming

So after having posted some graphs without further explanation yesterday, I think it is a good idea to actually explain what these graphs were all about.

We at SABRE have worked on automated classification of malware over the last few weeks. Essentially, thanks to Rolf's relentless optimization efforts and a few algorithmic tricks, the BinDiff2 engine is blazingly fast. As an example, we can diff a router image with 30000+ functions in under 8 minutes on my laptop now, and that includes reading the data from the IDA database. That means we can afford to run a couple of hundred thousand diffs on a collection of malware samples and then work with the results -- malware rarely exceeds 1000 functions, and anything of that size is diffed in a few seconds.

So we were provided with a few thousand samples of bots that Thorsten Holz had collected. The samples themselves were only marked with their MD5sum.

We ran the first few hundred through a very cheap unpacking tool and then disassembled the results. We then diffed each sample against every other sample. Afterwards, we ran a phylogenetic clustering algorithm on top of the results and ended up with this graph:

The connected components have been colored, and a hi-res variant of this image can be looked at here.
The labels on the edges are measures of graph-theoretic similarity -- a value of 1.000 means that the executables are identical, lower values give the percentage-wise similarity. For this graph we decided to keep everything with a similarity of 50% or greater in one family, and to cut off everything else.
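For the curious: the "everything above 50% stays in one family" rule is just computing connected components of the thresholded similarity graph. A rough union-find sketch with made-up scores (not our actual code) looks like this:

/* Sketch: family grouping as connected components over a similarity threshold. */
#include <stdio.h>

#define N 6

static int parent[N];

static int find(int x) { return parent[x] == x ? x : (parent[x] = find(parent[x])); }
static void unite(int a, int b) { parent[find(a)] = find(b); }

int main(void) {
    /* Hypothetical pairwise similarity scores between six samples. */
    double sim[N][N] = {
        {1.00, 0.92, 0.61, 0.10, 0.05, 0.08},
        {0.92, 1.00, 0.58, 0.12, 0.07, 0.09},
        {0.61, 0.58, 1.00, 0.11, 0.06, 0.04},
        {0.10, 0.12, 0.11, 1.00, 0.75, 0.03},
        {0.05, 0.07, 0.06, 0.75, 1.00, 0.02},
        {0.08, 0.09, 0.04, 0.03, 0.02, 1.00},
    };

    for (int i = 0; i < N; i++) parent[i] = i;
    for (int i = 0; i < N; i++)
        for (int j = i + 1; j < N; j++)
            if (sim[i][j] >= 0.5)       /* the 50% cut-off */
                unite(i, j);

    /* Samples 0-2 end up in one family, 3-4 in another, 5 is isolated. */
    for (int i = 0; i < N; i++)
        printf("sample %d -> family %d\n", i, find(i));
    return 0;
}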

So what can we read from this graph? First of all, it is quite obvious that although we have ~200 samples, we only have two large families, three small families, two pairs of siblings and a few isolated samples. Secondly, even the most "distant relatives" in the cyan-colored cluster are 75% similar, and the most "distant relatives" in the green cluster on the right are still 58% similar. If we cut the green cluster on the right into two subclusters, the most distant relatives are 90% similar.

Now, in order to double-check our results with what AV folks already know, Thorsten Holz provided us with the results of a run of ClamAV on the samples, and we re-generated the graphs with the names of those samples that were recognized by ClamAV.

The result looks like this:

(check this graphic for a hi-res version which you will need to make sense of the following ;)

What can we learn from this graph ? First of all, we see that the various members of the GoBot family are so similar to the GhostBot branch that we should probably consider GoBot and GhostBot to be of the same family. The same holds for the IrcBot and GoBot-3 samples and for the Gobot.R and Downloader.Delf-35 samples -- why do we have such heavily differing names when the software in question is so similar ?

We seem to consider Sasser.B and Sasser.D to be of the same family with a similarity of 69%, but Gobot.R and Downloader.Delf-35, which are a lot more similar, have their own families ?

What we can also see is that we have two samples in the green cluster (GoBot, IRCBot, GhostBot, Downloader.Delf) that have not been recognized by ClamAV and that seem to have their own branch in the family tree:

Apparently, these two specimens are new variants in the above family tree.

Miscellaneous other things we can learn from this (admittedly small) graph:

  • We have 6 variants of Trojan.Crypt.B here that go undetected by ClamAV
  • We have an undetected variant of Worm.Korgo.Z
  • PadoBot and Korgo.H are very closely related, whereas Korgo.Z does not appear to be very similar. Generally, the PadoBot and the Korgo.H/Korgo.Y/Korgo.AJ family seem to be just one family.

So much fun.
Anyhow, if you guys find this stuff interesting, drop me a mail at halvar.flake@sabre-security.com ...

Friday, April 28, 2006

Also, Rolf has created a Shockwave movie in which he shows how to use BinDiff to find GPL violations.

Check it here...
Automated classification of malware is easier than it seems once you have the right infrastructure -- in our case consisting of the BinDiff2 engine, a few generic unpacking tools and a graph layout program.

I have uploaded some graphics:

This is just a collection of arbitrary malware whose members were identified by use of AV programs. We then BinDiff'ed all samples and used some phylogenetics algorithm to draw this diagram. The results are quite neat, although we did not filter library functions, so some very simple viruses have high similarity due to the fact that 95% of their code is statically linked library code.

This is a collection of a few hundred bots. They were collected on Thorsten Holz's Honeynet, and we auto-unpacked them and then did the BinDiffing/tree generation. This time, we filtered libraries as well as we could. The 184 samples here all have different MD5sums, but the large majority belongs to essentially two families. All in all, we have ~5 "families", two pairs of "siblings" and 9 isolated species here. Fun.

Wednesday, April 26, 2006

I am in a state of brainfry. Which I guess is good. Exhaustion can be fun. I think I haven't worked as hard as I am working at the moment in quite a while, and while the results move slowly (due to fighting on many fronts), they move, and in a good direction. Now I will sleep. Ah, for those into this sorta stuff, here is what claims to be a significant improvement of the FMS attack on RC4. Good read, although nothing worldmoving, and I haven't verified the results fully yet.

Wednesday, April 19, 2006

Quote from the Matasano Blog:
"We are so funded now"
^^ Classic :)
Language in thought and action or "what's correct vs. what's right"

Everybody should, IMO, at some point in their life, have read Hayakawa's "Language in Thought and Action". It is a book about semantics -- about words and their meanings. Semantics pop up in very amusing situations, for example in code analysis/review, and a few other places.

A small thing to take out of that book is that the entry in a dictionary is not the definitive meaning of a word, but that the meaning of the word is (somewhat circularly) defined as what the collective mind thinks this word means.

There's currently a thread on Bugtraq about the way GCC treats certain 'invalid language' constructs. While I find very few things more amusing than compiler changes breaking formerly working code, I also find the reaction of everybody except Fefe on that list very amusing: instead of relying on what everybody thinks the meaning of that line of code is, they refer to the written standard. In essence, they look up the meaning of a word in a dictionary.

Yes, standards are important. But there is a difference between the standard on paper and the 'standard', meaning the way everybody perceives (and writes in) the language. And because C's standard has been in some ways unenforced for 20+ years, there are lots of existing 'standard' idioms (such as Fefe's int wrap check) that are not valid by the written standard.
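To make this concrete, here is my reconstruction of the kind of wrap check at issue (not Fefe's literal code), next to the version the written standard actually blesses:

/* The traditional signed wrap check relies on undefined behaviour, so a
   standard-conforming compiler is allowed to optimize it away entirely. */
#include <stdio.h>
#include <limits.h>

int idiomatic_but_undefined(int a, int b) {
    /* Signed overflow is UB -- the compiler may assume a + b never wraps
       (the check assumes b >= 0). */
    if (a + b < a)
        return -1;                      /* "overflow detected" */
    return a + b;
}

int standard_blessed(int a, int b) {
    /* Check against the limits before adding; no UB is ever executed. */
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return -1;
    return a + b;
}

int main(void) {
    printf("%d\n", idiomatic_but_undefined(1, 2));   /* 3; with INT_MAX it's anyone's guess */
    printf("%d\n", standard_blessed(INT_MAX, 1));    /* reliably -1 */
    return 0;
}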

What we're trying to do here is the equivalent of not correcting a child's grammar for 20 years, letting it play with other kids with broken grammar, and always understanding what the child wanted to say when it used broken grammar. And then, once the child turns 20, we cross out the constructs with bad grammar from everything it says and interpret the rest.

If the construct is so invalid, have the compiler warn at least :)
This sounds like a sure recipe for disaster to me :)

I had a discussion with Rolf yesterday about GCC's ghastliness, and the fact that we're spending a good bit of our time fighting compiler woes instead of coding. Then again, we're doing something rather odd: we're trying to write C++ code with GCC, which, frankly, doesn't work. I am very inclined to switch to Intel's C++ compiler.

Oh, while I am ranting: on which CPU is GCC's weird optimization of passing arguments by subtracting from ESP and then issuing several 'mov [esp+const], reg' faster than push/call? Sure, on a cycle-by-cycle basis each instruction is faster, but has anyone ever considered the impact of a 3x increase in code size on CPU caching?

I'll shut up now before this deteriorates into more ranting.

Monday, April 17, 2006

http://teh-win.blogspot.com/ has (as usual) an amusing read up, which harps on a point that I can't support enough: 0days != hacking. Almost all "real" hacking is done via transitive trust (and the same goes for pentests). 0days allow you to more quickly get _some_ trust to exploit transitively, but the "real" work is done on transitive trust. And transitive trust and "real" hacking get too little credit at security conferences, mainly because any "real" research here is by direct implication illegal ("... I wrote this worm that exploits transitive trust ... and I have some empirical data on its spreading capabilities *cough* ...").

Now I just need to find a dictionary that tells me what "branler la nouille en mode noyau" means ;)
Publication Economics and Cryptography Research

Something I cannot cease to wonder about is why, historically, there has been so little published research on the cryptanalysis of block ciphers. There seem to be millions of articles describing "turning some math guy's favourite mathematical problem into an asymmetric crypto algorithm" and a similar flood of "fair coin flipping if all participants are drunk cats and the coin is a ball of yarn"-type papers. All in all, there have been perhaps fewer than 20 REALLY important papers in the analysis of symmetric crypto in ... uhm ... the last 10 years (I count hashes as symmetric crypto here).

What's the reason for this ?

First of all, symmetric crypto tends not to have a "nice" mathematical structure. This changed somewhat with AES, but almost everything on the symmetric side is rather ugly to look at. Sure, everything can be written as large multivariate polynomials over GF(2), but that's just a prettier way of writing a large boolean formula. So it's hard for anybody in a math department to justify working on something that is "like a ring, but not quite, or like a group, but not quite".

Secondly, starting to build a protocol or proposing a new asymmetric cipher is something that a sane researcher (who has not earned tenure yet) can do in a "short" window of time. Setting out to break a significant crypto algorithm could very easily lead to "10+ years in the wilderness and a botched academic career due to a lack of publications". The result: if you haven't earned tenure yet and want to work in crypto, you work on the constructive side.

I find this a bit frustrating. I'd like to work on ciphers, specifically on BREAKING ciphers. I seriously could never get myself excited about defense. I wouldn't mind spending a few years of my life on one cipher. But academically, and career-wise, this is clear suicide.

Perhaps we should value "destructive" research more. From my personal viewpoint, a break of a significant cipher is worth more than 20 papers on fair coin flipping in the absence of gravity. But when it comes to giving out tenure, it almost seems that the 20 papers outweigh the one.

Ahwell. Enough ranting, back to work.

Friday, March 31, 2006

http://metasploit.blogspot.com/2006/03/few-msrt-graph-illustrations.html

This is cool.

Wednesday, March 15, 2006

My latest travel madness is over, and after 4 weeks on the road I am home again. The last week of that travelling I spent mostly sick -- that means I went to BlueHat for all the work and then had to skip all the fun. Furthermore, I learnt a bunch of interesting things concerning sinusitis and pressure changes during intercontinental flights.

Those that know me know I am a graph nerd (and by extension an OBDD nerd), so check
this paper for some more OBDD fun.
While you are doing your brain some good, you might as well read this paper -- it is fascinating and very useful at the same time.

Now, I am just enjoying the second day of my first "weekend" in a month -- I decided not to work for two days, and spent the first 34 hours of these two days in bed. Now I am cleaning my flat and will be reading some fun shit later on (compact Riemann surfaces, perhaps?). Or I will call Ero and see if he is up for a round of Go.

I saw that Ilfak reads my BH presentations -- I am flattered and thus will need to try harder with them the next time.

One of the highlights of BlueHat was meeting a bunch of MS folks that I respect A LOT: Jon, Brandon, Alex, Pete, Jan -- thanks for being awesome.

I have read many good things about OCaml recently, and a bunch of very interesting programs (ASTREE and SLAM/SDV) seem to be written in it. Can someone offer some comments on it?

Sunday, February 26, 2006

My travel schedule remains insane, and I notice that thinking about 'hard' problems is very difficult while travelling. Sigh. Anyhow, those that had beer with me in the last year and a half or so know that I have a severe interest in OBDD-like structures and their applications to cryptanalysis.
So I just wanted to put a reference here to
http://eprint.iacr.org/2006/072.pdf

Saturday, February 18, 2006

We at SABRE will be giving a 3-day advanced reverse engineering course this fall in Frankfurt. Check this link if you are interested in that sorta stuff.

Tuesday, February 07, 2006

History is important. It is VERY cool to see DaveG posting old 8LGM memorabilia. Thanks !

I have to hit the train now. Will read this
on the train -- skimmed it in the subway, and it looks ridiculously interesting (if you happen to be into solving systems of multivariate polynomial equations or into classical algebraic geometry in general). It claims to contain a bridge between multivariate polys over finite fields and univariate polynomials over certain extensions of the same field. We'll see how good it actually is.
Hans Dobbertin died of cancer on the 2nd of February. I did not know him well -- I had switched universities in order to write my thesis under his supervision, but shortly after I arrived at the new university he fell ill. The few times that we talked it was a lot of fun though, and I respected him greatly. There are only a limited number of professors I have met in my life who really understood and seemed to appreciate what I am doing, and it did not take much to explain it to Prof. Dobbertin (which might have to do with his work prior to becoming a university professor). My condolences to his family and those who were close to him.

Aside from the personal sadness I feel, it is a setback for my Diplomarbeit (diploma thesis). But well, that is nothing in the big picture, really.