Data Leakage Detection
ABSTRACT:
A data distributor has given sensitive
data to a set of supposedly trusted agents (third parties). Some of the data is
leaked and found in an unauthorized place (e.g., on the web or somebody’s
laptop). The distributor must assess the likelihood that the leaked data came
from one or more agents, as opposed to having been independently gathered by
other means. We propose data allocation strategies (across the agents) that
improve the probability of identifying leakages. These methods do not rely on
alterations of the released data. In some cases we can also inject “realistic
but fake” data records to further improve our chances of detecting leakage and
identifying the guilty party.
EXISTING SYSTEM:
Traditionally, leakage detection is
handled by watermarking, e.g., a unique code is embedded in each distributed
copy. If that copy is later discovered in the hands of an unauthorized party,
the leaker can be identified. Watermarks can be very useful in some cases, but
they involve some modification of the original data. Furthermore, watermarks
can sometimes be destroyed if the data recipient is malicious.
Sharing sensitive data with third parties is common. For example, a hospital may give patient records
to researchers who will devise new treatments. Similarly, a company may have partnerships
with other companies that require sharing customer data. Another enterprise may
outsource its data processing, so data must be given to various other
companies. We call the owner of the data the distributor and the supposedly
trusted third parties the agents.
PROPOSED SYSTEM:
Our goal is to detect when the
distributor’s sensitive data has been leaked by agents, and if possible to
identify the agent that leaked the data. Perturbation is a very useful
technique where the data is modified and made “less sensitive” before being
handed to agents. However, because perturbation alters the released data, we instead develop unobtrusive
techniques for detecting leakage of a set of objects or records.
In this section we develop a model for
assessing the “guilt” of agents. We also present algorithms for distributing
objects to agents, in a way that improves our chances of identifying a leaker.
Finally, we also consider the option of adding “fake” objects to the
distributed set. Such objects do not correspond to real entities but appear
realistic to the agents. In a sense, the fake objects act as a type of
watermark for the entire set, without modifying any individual members. If it
turns out that an agent was given one or more fake objects that were leaked, then
the distributor can be more confident that the agent was guilty.
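As a rough illustration of how such a guilt assessment might be computed, the sketch below scores one agent against a leaked set. It is a minimal example under stated assumptions, not the project's actual code: the class name GuiltEstimator, the method guilt, and the parameter p (an assumed probability that a leaked object was obtained independently rather than from an agent) are all hypothetical. The model assumes that every agent holding a leaked object is equally likely to be its source.

```java
import java.util.List;
import java.util.Set;

/** Illustrative guilt scoring; names and model are assumptions, not the project's code. */
class GuiltEstimator {

    /**
     * Estimates the probability that one agent leaked at least one object.
     *
     * @param leaked      objects found in the unauthorized place
     * @param agentIndex  index of the agent being assessed
     * @param allocations allocations.get(i) is the set of objects given to agent i
     * @param p           assumed probability that an object was obtained independently
     */
    static double guilt(Set<String> leaked, int agentIndex,
                        List<Set<String>> allocations, double p) {
        double probInnocent = 1.0;
        for (String obj : leaked) {
            if (!allocations.get(agentIndex).contains(obj)) {
                continue; // the agent never held this object
            }
            // Count how many agents received this object.
            int holders = 0;
            for (Set<String> allocation : allocations) {
                if (allocation.contains(obj)) {
                    holders++;
                }
            }
            // (1 - p) / holders: assumed chance that this agent is the source.
            probInnocent *= 1.0 - (1.0 - p) / holders;
        }
        return 1.0 - probInnocent;
    }

    public static void main(String[] args) {
        List<Set<String>> allocations = List.of(
                Set.of("r1", "r2", "fake-A"), // agent 0 also holds a fake object
                Set.of("r1", "r2"));          // agent 1 holds only real objects
        Set<String> leaked = Set.of("r1", "r2", "fake-A");
        System.out.println("agent 0: " + guilt(leaked, 0, allocations, 0.2));
        System.out.println("agent 1: " + guilt(leaked, 1, allocations, 0.2));
    }
}
```

In this example the fake object is held by agent 0 alone, so its appearance in the leaked set raises agent 0's score well above agent 1's. This matches the intuition that a leaked fake object makes the distributor more confident about who leaked.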
MODULES:
1. Data Allocation Module:
The main focus of our project is the
data allocation problem: how can the distributor “intelligently” give data to
agents in order to improve the chances of detecting a guilty agent? In this
module the admin sends files to authenticated users, and users can edit their
account details. Each agent receives the secret key details by mail.
2. Fake Object Module:
The distributor creates and adds fake
objects to the data that he distributes to agents. Fake objects are objects
generated by the distributor in order to increase the chances of detecting
agents that leak data. Our use of fake objects is inspired by the use of “trace” records in
mailing lists. If the wrong secret key is supplied when downloading a file, a
duplicate (fake) file is opened instead, and the fake object details are
displayed and also sent by mail.
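One way the fake-object bookkeeping could work is sketched below: the distributor issues a unique fake record per agent, remembers who received it, and later checks a leaked set against that registry. This is a hedged illustration only. The class FakeObjectRegistry, its methods, and the "cust-" record format are hypothetical names chosen for the example and are not taken from the project.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** Illustrative registry of fake objects; all names here are assumptions. */
class FakeObjectRegistry {
    // Maps each issued fake record to the agent who received it.
    private final Map<String, String> fakeToAgent = new HashMap<>();

    /** Creates a unique fake record for the agent and remembers the recipient. */
    String issueFakeFor(String agentId) {
        String fake = "cust-" + UUID.randomUUID();
        fakeToAgent.put(fake, agentId);
        return fake;
    }

    /** Returns the agents whose fake records appear in the leaked set. */
    Set<String> suspects(Set<String> leaked) {
        Set<String> result = new LinkedHashSet<>();
        for (String record : leaked) {
            String agent = fakeToAgent.get(record);
            if (agent != null) {
                result.add(agent);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        FakeObjectRegistry registry = new FakeObjectRegistry();
        String fakeForA = registry.issueFakeFor("agentA");
        registry.issueFakeFor("agentB");
        // Only agentA's fake record shows up in the leaked data.
        System.out.println(registry.suspects(Set.of("real-1", fakeForA)));
    }
}
```

Because each fake record goes to exactly one agent, finding it in leaked data points at that agent without modifying any real record.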
3. Optimization Module:
In the Optimization Module, the
distributor’s data allocation to agents has one constraint and one objective.
The distributor’s constraint is to satisfy the agents’ requests, by providing them
with the number of objects they request or with all available objects that
satisfy their conditions. His objective is to be able to detect an agent who
leaks any portion of his data. Users can also lock and unlock files for security.
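The constraint and objective above can be illustrated with a simple greedy heuristic: give every agent the number of objects requested (the constraint), and prefer objects that have been handed out least often so that agents' sets overlap as little as possible (a stand-in for the detection objective). This is a sketch under assumptions, not the project's algorithm. The class OverlapAwareAllocator and its method are hypothetical, objects are assumed to be distinct, and requests with explicit conditions are not handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Comparator;
import java.util.Set;

/** Illustrative greedy allocation; names and heuristic are assumptions. */
class OverlapAwareAllocator {

    /**
     * @param objects   distinct objects the distributor owns
     * @param requested requested[i] is how many objects agent i asked for
     * @return one allocation per agent, in the same order as requested
     */
    static List<Set<String>> allocate(List<String> objects, int[] requested) {
        Map<String, Integer> timesGiven = new HashMap<>();
        for (String o : objects) {
            timesGiven.put(o, 0);
        }

        List<Set<String>> allocations = new ArrayList<>();
        for (int count : requested) {
            List<String> candidates = new ArrayList<>(objects);
            // Stable sort: least-shared objects first, ties keep the original order.
            candidates.sort(Comparator.comparingInt((String o) -> timesGiven.get(o)));

            Set<String> given = new LinkedHashSet<>();
            int take = Math.min(count, candidates.size());
            for (String o : candidates.subList(0, take)) {
                given.add(o);
                timesGiven.merge(o, 1, Integer::sum);
            }
            allocations.add(given);
        }
        return allocations;
    }

    public static void main(String[] args) {
        List<String> objects = List.of("a", "b", "c", "d");
        System.out.println(allocate(objects, new int[] {2, 2}));
    }
}
```

When the requests fit within the available objects, this heuristic gives agents disjoint sets, so any leaked object points at a single agent. When they do not fit, some overlap is unavoidable and the heuristic only spreads it out.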
4. Data Distributor:
A data distributor has given sensitive
data to a set of supposedly trusted agents (third parties). Some of the data is
leaked and found in an unauthorized place (e.g., on the web or somebody’s
laptop). The distributor must assess the likelihood that the leaked data came
from one or more agents, as opposed to having been independently gathered by
other means. The admin can view which file has been leaked, along with the
fake user’s details.
Hardware Required:
System : Pentium IV 2.4 GHz
Hard Disk : 40 GB
RAM : 256 MB
Software Required:
O/S : Windows XP
Language : JAVA, JSP