Wednesday, February 28, 2018

Hybrid scheme of public-key encryption



We introduce a hybrid homomorphic encryption that combines public-key encryption (PKE) and somewhat homomorphic encryption (SHE) to reduce the storage requirements of most somewhat or fully homomorphic encryption (FHE) applications.

In this model, messages are encrypted with a PKE and computations on encrypted data are carried out using SHE or FHE after homomorphic decryption.

To obtain efficient homomorphic decryption, our hybrid scheme combines an IND-CPA-secure PKE that avoids complicated message padding with an SHE scheme that has a large integer message space.

Furthermore, if the underlying PKE is multiplicative, the proposed scheme has the advantage that polynomials of arbitrary degree can be evaluated without bootstrapping.

We construct this scheme by concatenating the ElGamal and Goldwasser-Micali schemes over the ring Z_N for a composite integer N, whose message space is Z_N^x.
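As a rough illustration of the multiplicative property the construction relies on, here is a toy textbook ElGamal in Python (our own sketch, not code from the paper; the paper works over a ring Z_N with composite N, and real parameters would be far larger):

```python
import random

# Toy multiplicative ElGamal over Z_p* (illustration only; tiny prime,
# insecure parameters, demonstration of the homomorphism only).
p = 467          # small prime for demonstration
g = 2            # assumed generator of a subgroup of Z_p*
x = random.randrange(1, p - 1)   # secret key
h = pow(g, x, p)                 # public key

def encrypt(m):
    r = random.randrange(1, p - 1)
    return (pow(g, r, p), m * pow(h, r, p) % p)

def decrypt(c):
    c1, c2 = c
    # c1^(p-1-x) = c1^(-x) mod p, so this strips the mask h^r = g^(xr)
    return c2 * pow(c1, p - 1 - x, p) % p

def mul(ca, cb):
    # Component-wise product of ciphertexts encrypts the product of plaintexts.
    return (ca[0] * cb[0] % p, ca[1] * cb[1] % p)

m1, m2 = 5, 7
assert decrypt(mul(encrypt(m1), encrypt(m2))) == (m1 * m2) % p
```

This multiplicativity is exactly what lets polynomials be evaluated over the encrypted data without bootstrapping, as noted above.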



Concept



The concept of computation on encrypted data without decryption was first introduced by Rivest, Adleman and Dertouzos in 1978.
Thirty years later, Gentry proposed a fully homomorphic encryption (FHE) based on ideal lattices.
This scheme is far from practical because of its large computational cost and large ciphertexts.

Since then, considerable efforts have been made to devise more efficient schemes. However, most FHE schemes still have very large ciphertexts (millions of bits for a single ciphertext). This presents a considerable bottleneck in practical deployments.


Key Encryption Process



Signature Searching in a Networked Collection of Files


Description
A signature is a data pattern of interest in a large data file or a set of large data files. Signatures that need to be found arise in applications such as DNA sequence analysis, network intrusion detection, biometrics, large scientific experiments, speech recognition, and sensor networks. A closely related problem is string matching.

More specifically, we envision a problem where long linear data files (i.e., flat files) contain multiple signatures that are to be found using a multiplicity of processors (a parallel processor).

This paper evaluates the performance of finding signatures in files residing in the nodes of parallel processors configured as trees, two-dimensional meshes, and hypercubes.

We assume various combinations of sequential and parallel searching. A unique feature of this work is that it is assumed that data is pre-loaded onto processors, as may occur in practice, thus load distribution time need not be accounted for.

Elegant expressions are found for average signature searching time and speedup, and graphical results are provided.

Figure: Signature

Introduction
A signature is a relatively small data pattern of interest embedded in a very large (in this paper sequential) data file. It is assumed signatures are temporally distinct and do not overlap each other. That is, there can be multiple signatures in a file.

Because the files we study are much longer than the signatures, signatures are assumed to have infinitesimally small length. Such signature searching occurs in network security, signal processing, medicine, image processing, sensor technology, and many other fields.

Most previous work on signature searching (known as template matching or string matching) develops algorithms for the detailed matching process. This paper addresses an upper-level view of signature searching involving system performance evaluation.

However, we now briefly summarize some string matching work. String searching, which is similar to our concept of signature searching, is a special case of pattern searching. It generically involves finding a pattern of length m in a text of length n over some alphabet.

The worst-case complexity of exact string matching is O(n), but the proportionality constant of the linear term varies considerably with the algorithm, ranging from m for the naive algorithm to 2 for the Knuth-Morris-Pratt algorithm.
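To make the contrast concrete, here is a minimal Python sketch (our illustration, not code from the paper) of the naive scan next to Knuth-Morris-Pratt, whose failure function ensures no text character is ever re-read:

```python
def naive_search(text, pat):
    # O(n*m) worst case: re-compares up to m characters at each shift.
    n, m = len(text), len(pat)
    return [i for i in range(n - m + 1) if text[i:i + m] == pat]

def kmp_search(text, pat):
    # Knuth-Morris-Pratt: O(n + m) total character comparisons.
    m = len(pat)
    fail = [0] * m                      # failure (longest proper prefix) table
    k = 0
    for i in range(1, m):
        while k and pat[i] != pat[k]:
            k = fail[k - 1]
        if pat[i] == pat[k]:
            k += 1
        fail[i] = k
    hits, k = [], 0
    for i, c in enumerate(text):
        while k and c != pat[k]:
            k = fail[k - 1]
        if c == pat[k]:
            k += 1
        if k == m:                      # full match ending at position i
            hits.append(i - m + 1)
            k = fail[k - 1]
    return hits

assert naive_search("abababa", "aba") == kmp_search("abababa", "aba") == [0, 2, 4]
```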




Tree network

Assume that in a tree each node is a structure containing exactly one file. Each node has one, two, or more child nodes, which are below it in the tree. A node that has a child is called the child's parent node.

A node has at most one parent. Nodes that do not have any children are called leaf nodes. They are also referred to as terminal nodes.

Figure: Multi-level tree network

A node's height is the length of the longest downward path from that node to a leaf. The root's height is the height of the tree.

The depth of a node is the length of the path from it to the root. Nodes at the same depth of the tree are said to be at the same level.

The height of the multiple-level tree in this paper is H. Throughout this paper there is a single file, possibly containing signatures, in each node.
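The height and depth definitions above can be sketched in Python (an illustrative toy of our own, with the tree given as parent and children maps):

```python
def height(children, node):
    # Height = length of the longest downward path from the node to a leaf.
    kids = children.get(node, [])
    return 0 if not kids else 1 + max(height(children, c) for c in kids)

def depth(parent, node):
    # Depth = length of the path from the node up to the root.
    d = 0
    while node in parent:
        node = parent[node]
        d += 1
    return d

# A small tree: root r with children a, b; a has child c.
children = {"r": ["a", "b"], "a": ["c"]}
parent = {"a": "r", "b": "r", "c": "a"}
assert height(children, "r") == 2    # this is the tree height H
assert depth(parent, "c") == 2
assert height(children, "b") == 0    # b is a leaf (terminal node)
```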

In all the cases for tree networks that we examine, the number of child nodes n_i per node is a random variable from 1 to N, possibly different for each level.

However, at the same level the number of children in each subtree is the same. In this section, if a parent node has a signature (or signatures), its child nodes may also have signatures.

If the parent node does not have any signature of interest, its child nodes also have no signatures and do not need to be searched.

Some dependency between signature occurrences is thus being assumed. This may model locality of reference - if a node has a signature(s), there may be related information (signatures) on its children.

Figure: Tree Network

Mesh Networks


Consider a regular two-dimensional mesh network of processors, a commonly used interconnection network. In this network structure each processor is located at the corner of four rectangles and has four neighbors.

The central processor is called the originator; it can communicate information or transport data to its four neighboring nodes.

As in trees, the central node is assumed to be layer 0 and its four neighboring nodes layer 1; the nodes which are neighbors of the layer-1 nodes but not in layer 0 are in layer 2, and in general the nodes which are neighbors of the layer-i nodes but not in layer i-1 are in layer i+1 of the mesh structure.

A node can only send messages or data to its neighboring nodes. If the node at the north of layer 2 wants to send a message to the node in layer 0, it first sends the message to the northern node of layer 1, which then forwards it to layer 0.

Assume that there are N layers in the mesh network and that there is at most one signature in every node. There are 4i nodes in the ith layer. Unlike the tree network, the case where every node can have a signature will be discussed in a later section.

That is, if a node has no signature, its child nodes can still have signatures. Assume the average time to search every node in the mesh network is the same, X, and again there is at most one signature in one file.

Figure: Mesh Network
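Given that layer i contains 4i nodes, the size of an N-layer mesh follows directly; a small Python sketch (our illustration, not from the paper):

```python
def nodes_in_layer(i):
    # Layer 0 is the originator alone; layer i (i >= 1) forms a diamond
    # of 4i nodes around it.
    return 1 if i == 0 else 4 * i

def total_nodes(n_layers):
    # 1 + 4 * (1 + 2 + ... + N) = 2N(N+1) + 1 nodes in an N-layer mesh.
    return 2 * n_layers * (n_layers + 1) + 1

assert [nodes_in_layer(i) for i in range(4)] == [1, 4, 8, 12]
assert total_nodes(3) == sum(nodes_in_layer(i) for i in range(4)) == 25
```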

On-Line Portfolio Selection

Robust Median Reversion Strategy for On-Line Portfolio Selection


The Portfolio Selection (PS) problem is concerned with determining a portfolio for allocating wealth among a set of assets so as to achieve some financial objective in the long run. There are two main mathematical models for this problem: the mean-variance model [Markowitz, 1952] and Kelly investment [Kelly, 1956]. In general, mean-variance theory, which trades off the expected return (mean) against the risk (variance) of a portfolio, is suitable for single-period (batch) portfolio selection. Robust Median Reversion is the first algorithm that exploits the reversion phenomenon via a robust L1-median estimator (Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence). Though simple in nature, Robust Median Reversion can produce better estimates than existing algorithms and has been empirically validated via extensive experiments on real markets.


Robust Median Reversion Strategy


On-line portfolio selection has been attracting increasing interest from the artificial intelligence community in recent decades. Mean reversion, one of the most frequent patterns in financial markets, plays an important role in some state-of-the-art strategies. Though successful on certain datasets, existing mean reversion strategies do not fully consider noise and outliers in the data, leading to estimation error and thus non-optimal portfolios, which results in poor performance in practice. To overcome this limitation, the reversion phenomenon is exploited through a robust L1-median estimator, and a novel on-line portfolio selection strategy named "Robust Median Reversion" (RMR) has been designed, which builds optimal portfolios based on the improved reversion estimate. Empirical results on various real markets show that Robust Median Reversion can overcome the drawbacks of existing mean reversion algorithms and achieve significantly better results. Robust Median Reversion runs in linear time and is thus suitable for large-scale trading applications.
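A standard way to compute an L1-median (geometric median) is Weiszfeld's iteration; the sketch below is our illustration of that estimator with made-up price vectors, not RMR's actual implementation:

```python
import math

def l1_median(points, iters=100, eps=1e-9):
    # Weiszfeld's iteration for the geometric (L1) median.
    # Each step is a distance-weighted average, so the estimate is far
    # less sensitive to outliers than the arithmetic mean.
    y = [sum(p[k] for p in points) / len(points) for k in range(len(points[0]))]
    for _ in range(iters):
        num = [0.0] * len(y)
        den = 0.0
        for p in points:
            d = math.dist(p, y)
            if d < eps:          # iterate landed exactly on a data point
                return p
            w = 1.0 / d
            den += w
            for k in range(len(y)):
                num[k] += w * p[k]
        y = [v / den for v in num]
    return y

# The median stays with the cluster despite the single gross outlier.
prices = [(10.0, 20.0), (10.1, 19.9), (9.9, 20.1), (100.0, 200.0)]
med = l1_median(prices)
assert abs(med[0] - 10.0) < 0.5 and abs(med[1] - 20.0) < 0.5
```

The arithmetic mean of the same four vectors would be dragged to roughly (32.5, 65.0), which is the kind of estimation error the robust median avoids.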



FINGERPRINTING OF SOFTWARE-DEFINED NETWORKS


Introduction

Software-defined networking (SDN) is an umbrella term encompassing several kinds of network technology aimed at making the network as agile and flexible as the virtualized server and storage infrastructure of the modern data center.

SDN eases network management by centralizing the control plane and separating it from the data plane.

The separation of planes in SDN, however, introduces new vulnerabilities in SDN networks, since the difference in processing packets at each plane allows an adversary to fingerprint the network’s packet-forwarding logic. 

In this paper, we study the feasibility of fingerprinting the controller-switch interactions by a remote adversary, whose aim is to acquire knowledge about specific flow rules that are installed at the switches.

Network devices can also notify the controller about network events (e.g., the reception of certain packets) and device state changes.

For example, a number of advanced reactive control plane implementations configure network devices to send notifications to the controller according to some installed policy (e.g., when a received packet does not match any of the installed flow rules).

Such a notification triggers the controller to perform a series of operations, such as installing the appropriate forwarding rules at the switches, reserving network resources on a given network path, and so on.

Problem Facing

SDN separates the control and data planes by defining a switch's programming interface and a protocol to access that interface, i.e., the OpenFlow protocol.

The controller leverages the OpenFlow protocol to access the switch's programming interface and configure the forwarding behavior of the switch's data plane.

The communication between the controller and switches is established using an out-of-band control channel.


Statements of problem

The main objective of our work is to study the ability of a remote adversary to identify whether an interaction between the controller and the switches (and a subsequent rule installation) has been triggered by a given packet. 

The absence of a controller-switch interaction typically provides evidence that the flow rules that handle the received packet are already installed at the switches. 

Otherwise, if a communication between the controller and the switches is triggered, then this suggests that the received packet requires further examination by the controller, e.g., since it does not have any matching entry stored at the switch’s flow table, or because the controller requires additional information before installing a forwarding decision at the switches.
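A minimal sketch of the adversary's decision rule, assuming (our simplification) that a controller-switch interaction shows up as an inflated round-trip time on the first packet of a flow; the threshold value is an illustrative parameter, not one from the paper:

```python
def rule_installed(first_rtt_ms, later_rtts_ms, threshold_ms=2.0):
    # If the first probe's RTT is not noticeably larger than subsequent
    # RTTs, we infer the matching flow rule was already installed; a large
    # gap suggests a packet-in round trip to the controller and a rule
    # installation. threshold_ms is an illustrative tuning knob.
    baseline = sorted(later_rtts_ms)[len(later_rtts_ms) // 2]  # median RTT
    return first_rtt_ms - baseline <= threshold_ms

assert rule_installed(40.3, [40.1, 40.2, 40.4])         # no controller hit
assert not rule_installed(55.0, [40.1, 40.2, 40.4])     # packet-in delay
```

Using the median of the later RTTs as the baseline keeps the rule robust to an occasional delayed reply.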

In contrast, a passive adversary cannot inject packets in the network but only monitors the exchanged traffic between the server and the client. Notice that passive adversaries are hard to detect by standard intrusion detection systems since they do not generate any extra network traffic.


Setup for experiment

The controller is configured to minimise the processing delay for an incoming packet-in event, i.e., we only require the controller to perform a table lookup and retrieve pre-computed forwarding rules in response to packet-in events. 

Furthermore, the controller always performs bi-directional flow installation; that is, the handling of a packet-in event triggers the installation of a pair of rules, one per flow direction, at each involved switch. 

We ensure that the controller’s CPU is not overloaded during our measurements. We deploy a cross-traffic generator on an AMD dual core processor running at 2.5 GHz to emulate realistic WAN traffic load on the switches’ ports that were used in our study. 

The generated cross traffic follows a Pareto distribution with 20 ms mean and 4 ms variance [7]. To analyze the effect of the data-link bandwidth on the fingerprinting accuracy, we bridge our SDN network to the Internet using 100 Mbps and 1 Gbps links (respectively), by means of a firewall running on an AMD Athlon dual-core 3800+ machine.
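For reproduction purposes, one can back out Pareto parameters from the stated mean and variance; the sketch below is our reconstruction (reading "4 ms variance" as 4 ms^2, which is our assumption) using Python's stdlib Pareto sampler:

```python
import math
import random

# Pareto(shape a, scale x_m): mean = a*x_m/(a-1),
# var = x_m^2 * a / ((a-1)^2 * (a-2)), hence var/mean^2 = 1/(a*(a-2)),
# which gives a = 1 + sqrt(1 + mean^2/var).
mean_ms, var_ms2 = 20.0, 4.0
a = 1.0 + math.sqrt(1.0 + mean_ms**2 / var_ms2)   # shape, about 11.05
x_m = mean_ms * (a - 1.0) / a                     # scale, about 18.19 ms

def inter_arrivals(n, seed=1):
    rng = random.Random(seed)
    # random.paretovariate(a) samples a Pareto with x_m = 1; rescale by x_m.
    return [x_m * rng.paretovariate(a) for _ in range(n)]

samples = inter_arrivals(200_000)
assert abs(sum(samples) / len(samples) - mean_ms) < 0.5  # mean ~ 20 ms
```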

For the purpose of our experiments, we collect measurement traces between an Intel Xeon E3-1230 3.20 GHz CPU server with 16 GB RAM and 20 remote clients deployed across the globe. Table I details the specifications and locations of the clients used in our experiments. 

In our testbed, the server and the software switch were co-located on the same machine. Note that, by reducing the time required for rule installation to a minimum, our testbed emulates a scenario that is particularly hard for fingerprinting.

Collection of Data

To collect timing information based on our features, we deployed 20 remote clients across the globe that exchange UDP-based probe packet trains with the local server.

Notice that we rely on UDP for transmitting packets since Internet gateways may filter TCP SYN or ICMP packets. Each probe train consists of: 


• A CLEAR packet signaling the start of the measurements. Upon reception of this packet, the controller deletes all the entries stored within the flow tables of the OpenFlow switches in Pcs.

• One second after the transmission of the CLEAR packet, the client transmits four MTU-sized packet pairs. Different packet pairs are sent with an additional second of separation.

• One second after the transmission of the last packet pair, another CLEAR packet is sent to clear all flow tables.

• Two packets separated by one second finally close the probe train.

We point out that all of our probe packets belong to the same network flow, i.e., they are crafted with the same packet header. For each received packet of every train, the local server issues a short reply (e.g., 64 bytes). We maintain a detailed log of the timing information relevant to the sending and reception of the exchanged probe packets.

When measuring dispersion, we account for out-of-order packets; this explains negative dispersion values. For each of our 20 clients, we exchange 450 probe trains on the paths Pcs and Psc to the server. 
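Dispersion here is simply the gap between consecutive arrival timestamps taken in send order, which is why reordering yields negative values; a minimal sketch (our illustration, with made-up timestamps in microseconds):

```python
def dispersions(recv_times):
    # Packet-pair dispersion: difference between consecutive arrival
    # timestamps, taken in send order. If the network reorders packets,
    # a later-sent packet can arrive first, producing a negative value.
    return [b - a for a, b in zip(recv_times, recv_times[1:])]

in_order = [0, 12_000, 25_000]        # arrival times in microseconds
reordered = [0, 30_000, 25_000]       # third packet overtook the second
assert dispersions(in_order) == [12_000, 13_000]
assert dispersions(reordered) == [30_000, -5_000]
```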

Half of these probe trains are exchanged before noon, while the remaining half are exchanged in the evening. In our measurements, we vary the number of OpenFlow switches that need to be configured in reaction to the exchanged probe packets. Namely, we consider the following four cases where a probe packet triggers the reconfiguration of some of the OpenFlow switches:

(1) one hardware switch, 
(2) two hardware switches, 
(3) three hardware switches, and 
(4) the software switch. 

We remark that the choice of the configured hardware switches in our testbed (cf. Figure 1) has no impact on the measured features, since we ensure that the remaining hardware switches already have matching rules installed.

Furthermore, we remark that packets of a probe train only traverse the software switch in case (4), i.e., when it is configured. In total, our data collection phase spanned April 27, 2015 to October 27, 2015, during which 869,201 probe packets were exchanged with our local server across all clients and configurations, amounting to almost 0.66 GB of data.

Development of Remote Monitoring

There is a need for software for the remote monitoring of industrial machines, with features for analyzing the machine data to detect any abnormality in operation and for generating a report based on this analysis.

A set of software tools can be developed for this diagnostic system, with the primary objective of detecting incipient faults that crop up and become more severe with time. Tools such as Motor Current Signature Analysis (MCSA) and sequence-component analysis can be used to detect incipient faults.

The front-end GUI (Graphical User Interface) displays the running conditions of the machines in this multi-machine monitoring system along with the maintenance schedule, while the back-end Management Information Base (MIB) controls all data collection, interfaces with a GSM modem to send SMS alerts with faulty-machine information, and sends control actions to the machine(s).

A hybrid data compression scheme can be used to make the system's communication both efficient and secure. The developed system should be tested with two or three machines in the laboratory.

Remote Machine Monitoring

Remote machine monitoring is the process of periodically checking the status of a remote machine to enhance overall production productivity.

The data sent by the remote machine monitoring system is very important for managers and higher officials to analyze production factors and to solve issues such as production delays, production calamities, and improper maintenance, since a breakdown otherwise takes a long time to be reported and fixed. This delay leaves devices unused for many days.

Such problems can be solved by this remote monitoring system, which monitors the status of the machine from any location, so that the problems described above can be avoided.

This system also provides clear visibility of production directly to higher officials instead of relying on a supervisor's report.

The remote machine monitoring system plays a major role in industrial applications, and its use is, to a large extent, universal.

The potential benefits of remote monitoring are significant: minimized labor costs, reduced need for on-site maintenance, intelligent dispatching of service personnel based on diagnostic machine data, fewer repeat repair visits due to missing spares, improved machine uptime and utilization, longer machine life due to preventive maintenance, and longer lifetime of critical components.



Tuesday, February 27, 2018

Auditing a Cloud provider's Compliance with Data Backup Requirements

DESCRIPTION

The new developments in cloud computing have introduced significant security challenges to guaranteeing the confidentiality, integrity, and availability of outsourced data. A service level agreement (SLA) is usually signed between the cloud provider (CP) and the customer.

For redundancy purposes, it is important to verify the CP's compliance with the data backup requirements in the SLA. There exist a number of security mechanisms to check the integrity and availability of outsourced data.

This task can be performed by the customer or be delegated to an independent entity that we will refer to as the verifier. However, checking the availability of data introduces extra costs, which can discourage the customer from performing data verification too often.

The interaction between the verifier and the CP can be captured using game theory in order to find an optimal data verification strategy. In this paper, we formulate this problem as a two player non-cooperative game. 

We consider the case in which each type of data is replicated a number of times, which can depend on a set of parameters including, among others, its size and sensitivity. 

We analyze the strategies of the CP and the verifier at the Nash equilibrium and derive the expected behavior of both the players. Finally, we validate our model numerically on a case study and explain how we evaluate the parameters in the model.
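The verifier-CP interaction is a classic inspection game; the sketch below (our simplified stand-in, not the paper's exact payoff model) computes the mixed-strategy Nash equilibrium from the players' indifference conditions:

```python
def inspection_equilibrium(c, f, v, L):
    # Illustrative inspection game (our simplification): the CP saves
    # backup cost c by cheating but pays fine f if caught; verification
    # costs the verifier v, and an undetected cheat costs the verifier L.
    # At the mixed Nash equilibrium each player randomizes so that the
    # other is indifferent between its two pure strategies.
    p_verify = c / f          # makes the CP indifferent: -p*f = -c
    q_cheat = v / (f + L)     # makes the verifier indifferent: q*f - v = -q*L
    return p_verify, q_cheat

p, q = inspection_equilibrium(c=2.0, f=10.0, v=1.0, L=4.0)
assert abs(p - 0.2) < 1e-12            # verify 20% of the time
assert abs(q - 1.0 / 14.0) < 1e-12     # CP cheats rarely
```

Note how the equilibrium verification probability depends only on the CP's cost-to-fine ratio: a larger fine f lets the verifier verify less often.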






WORK RELATED WITH CLOUD PROVIDER’S COMPLIANCE

It is important to verify the cloud provider’s compliance with the security requirements in the SLA. For example, Popa et al. designed a proof-based system to enable security guarantees in an SLA. 

In recent years, a significant number of data integrity schemes have been proposed by different researchers and gradually adapted to specific use cases such as outsourced databases and cloud computing, for which works focusing on public verifiability were noticeably helpful and allowed clients to delegate the verification process to third parties.

Among these schemes, the two main directions explored by researchers include the Provable Data Possession (PDP) for ensuring possession of data, and the Proof of Retrievability (POR) for data possession and retrievability. 

The main idea of PDP is that a data owner generates some metadata information for a data file to be used later for verification purposes. 

Many extensions of this scheme managed to decrease the communication cost and complexity, as well as to allow dynamic operations on data such as insertion, modification, or deletion. Moreover, several works proposed PDP schemes specific to cloud computing.

WEB TRAFFIC ANALYSIS ATTACK

INTRODUCTION


Web Traffic Analysis Attack Using Only Timing Information: we introduce an attack against encrypted web traffic that makes use only of packet timing information on the uplink. This attack is therefore impervious to existing packet padding defences. We consider an attacker of the type illustrated in Figure 1.

The attacker can detect the time when packets traverse the encrypted tunnel in the uplink direction, but has no other information about the clients’ activity. 

The attacker’s objective is to use this information to guess, with high probability of success, the web sites which the client visits. 

What is distinctive about the attack considered here is that the attacker relies solely on packet timestamp information, whereas previously reported attacks against encrypted web traffic have mainly made use of observations of packet size and/or packet count information.

Our interest in timing-only attacks is twofold. Firstly, packet padding is a relatively straightforward defence against attacks that rely primarily on packet size, and indeed is currently either already available or being implemented in a number of popular VPNs.



Secondly, alternative attacks based on packet counting are insensitive to packet padding defences but require partitioning a packet stream into individual web fetches so that the number of packets associated with each web fetch can be determined, which may be highly challenging in practice on links where there are no clear pauses between web fetches.

In contrast, packet timing-based attacks are not only largely unaffected by packet padding defenses but also, as we will show, do not require partitioning of the packet stream. Hence, they are potentially a practically important class of attack against current and future VPNs.

While some work has been carried out using inter-arrival time information to classify the application (HTTP, IMAP, etc.), to our knowledge there is no previous work reporting the use of timing information alone to construct a successful attack against encrypted web traffic.


EXISTING SYSTEMS



The attacker can detect the time when packets traverse the encrypted tunnel in the uplink direction, but has no other information about the clients’ activity. 

The attacker’s objective is to use this information to guess, with high probability of success, the web sites which the client visits. 

The attacker relies solely on packet timestamp information whereas the previously reported attacks against encrypted web traffic have mainly made use of observations of packet size and/or packet count information. 

Our interest in timing-only attacks is twofold. Packet padding is a relatively straightforward defense against attacks that rely primarily on packet size, and indeed is currently either already available or being implemented in a number of popular virtual private networks.

Alternative attacks based on packet counting are insensitive to packet padding defenses but require partitioning of a packet stream into individual web fetches in order for the number of packets associated with each web fetch to be determined, which may be highly challenging in practice on links where there are no clear pauses between web fetches.


