Abstract:
In the problem of information sharing, two goals must be met to fulfill the requirements of both information providers and information consumers: the information providers must enforce their data security/privacy constraints, while the information consumers are interested in particular information and want to acquire as much of it as possible.
To solve this problem, disclosure algorithms are applied in the information disclosure process so that information providers can compute what to share.
However, in a typical information disclosure process, the applied disclosure algorithm checks these goals solely against the exact data it outputs. This leads to serious security problems: a malicious information consumer, namely the adversary, may be able to infer additional information from the disclosure algorithm itself, in violation of the security/privacy constraints of the information providers.
This dissertation presents a number of techniques for answering basic questions about the problem of information sharing: how secure is an information disclosure process when the disclosure algorithm is known to the public, and if it is not secure, how can we make it so?
This dissertation starts by extending an existing solution to the problem of online query auditing, i.e., deciding whether a posed information request from an information consumer should be permitted or denied.
In the problem of online query auditing, an adversary may acquire more precise information than what has been disclosed by the information providers, based on the knowledge he or she obtains from the fact that some information requests have been denied. The existing solution, called simulatable auditing, partially solves the problem by achieving the first goal, guaranteeing the security constraints of the information providers. However, it fails to achieve the second goal: many information requests from information consumers are denied unnecessarily, causing a significant loss of data availability. This dissertation proposes a new solution that achieves both goals by identifying a sufficient and necessary condition for guaranteeing the data protection of the information providers.
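To make the idea of simulatable auditing concrete, the following sketch (in Python, with all names and the denial rule chosen here purely for illustration, not taken from this dissertation) shows an auditor for SUM queries over boolean records whose deny/allow decision is computed only from the public query history and the answers already released, so the decision itself leaks nothing about the true data:

    from itertools import product

    def simulatable_audit(n, past_queries, past_answers, new_query):
        """Decide, using only public information (the query history and
        the released answers), whether answering `new_query` is safe.
        Records are boolean; each query is an index set answered by its SUM."""
        # Enumerate every dataset still consistent with the released answers.
        consistent = [x for x in product((0, 1), repeat=n)
                      if all(sum(x[i] for i in q) == a
                             for q, a in zip(past_queries, past_answers))]
        # Simulate every answer the new query could produce; deny if some
        # possible answer would fully determine an individual record.
        for ans in {sum(x[i] for i in new_query) for x in consistent}:
            remaining = [x for x in consistent
                         if sum(x[i] for i in new_query) == ans]
            if any(len({x[i] for x in remaining}) == 1 for i in range(n)):
                return False  # deny; the true data were never consulted
        return True

Because the auditor enumerates every dataset consistent with the public history, its decision is identical whether or not the true data would actually be breached; this is exactly what makes denials safe, at the cost of denying some requests that the true data would in fact permit, which is the availability downgrade discussed above.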
This dissertation then studies a more relaxed problem, the problem of microdata disclosure, in which the disclosure algorithm has to choose what to disclose from multiple candidate datasets. The problem of checking whether the security/privacy protection of the information providers has been violated turns out to be much harder in this case: the general case is NP-complete. This problem has not been given enough attention, and most existing solutions suffer from a failure of the desired data security/privacy protection. This dissertation presents a new model for designing safe disclosure algorithms that at least guarantee the data protection of the information providers. Due to the hardness of the problem, heuristic algorithms are also proposed to achieve acceptable performance in real-life data applications.
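As a hedged illustration of why checking only the disclosed output is insufficient (the function names and the safety rule below are assumptions for this sketch, not the dissertation's model): when a deterministic disclosure algorithm f is public, an adversary who observes an output can compute its preimage, the set of all secret tables that f maps to that output, and a disclosure is safe only if the sensitive information varies across that preimage:

    def preimage(f, candidates, observed):
        """All secret tables an adversary still considers possible after
        seeing `observed`, given that the algorithm `f` is public."""
        return [t for t in candidates if f(t) == observed]

    def is_safe(f, candidates, secret, sensitive):
        """Safe iff the adversary cannot pin down the sensitive value:
        it must not be constant over the preimage of the disclosed output."""
        possible = preimage(f, candidates, f(secret))
        return len({sensitive(t) for t in possible}) > 1

A safe disclosure algorithm must guarantee this property for every output it can produce rather than for the one table actually disclosed, which gives some intuition for why the checking problem becomes computationally hard.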
Finally, this dissertation addresses an open problem: how to restore data security/privacy when it has already been compromised by incidental data disclosure, which is unavoidable when multiple information providers disclose the same set of information without collaboration or centralized control. This dissertation shows that, under certain conditions, this can be accomplished by applying a statistical approach.