This paper focuses on distinguishing human-generated activity from bot-generated activity. Then, human-generated activity can be given preferential treatment (e.g. favorable routing of traffic, not being treated as spam). Their measure for distinguishing human-generated actions from machine-generated actions is pretty coarse and imprecise: an action is human-generated if it is preceded by keyboard or mouse input within a certain amount of time.
To implement this scheme, they go into considerable (exhaustive) detail about how to use the Trusted Computing Module (TPM) to build a trusted path between the physical input devices (keyboard, mouse) and a small piece of software called the attestor. To certify an action as human-generated, applications ask the attester for an attestation, passing a hash of the content to the attested for. If there has been user input within a predefined period, the attester returns a cryptographically-signed token that can be attached to the user action. When an upstream service receives the user action (e.g. HTTP request, email), it can verify the attestation by hashing the content of the action, and checking the cryptographic signature. Incorporating the content hash prevents an attestation for action x being used instead with action y. The verifier also needs to check that attestations are not reused, so the attester includes a nonce in the attestation token.
It is possible that a bot can monitor user actions, and submit malicious content to the attester whenever the user uses an input device. This would allow attestations to be created for malicious content, which means upstream software cannot blindly trust attested-for content. To reduce the impact of this attack, the paper suggests rate-limiting attestations to one per second.
Discussion
I liked how the paper discussed an alternative approach to the same problem (having the attester track keyboard and mouse inputs, and then match that recorded history against the content that is to be attested, looking for a correspondence). Many papers present the solution they chose as the only alternative, when in fact it usually represents only one point in a much richer design space.
In some sense, the inverse of the proposed functionality would be more useful: i.e. being able to assert “this content was definitely bot-generated.” As proposed, it might be very hard for upstream services to make use of the certifications unless this idea saw widespread adoption. For example, suppose that 0.1% of your traffic is guaranteed by NAB to be human-generated. The remaining traffic may or may not be bot-generated, so you effectively can’t discriminate against it.
The paper suggests that, using this approach, human-generated content on a bot-infested machine can be effectively distinguished from bot-generated traffic. This seems pretty unlikely: the bot software can simply suppress all outgoing attestations (e.g. by installing a rootkit and interfering with the API used to request attestations), leaving upstream software in the same state as they would be without NAB.