Tuesday, February 27, 2018

SIGNATURE SEARCHING IN A NETWORKED COLLECTION OF FILES

Description

A signature is a data pattern of interest in a large data file or set of large data files. The need to find such signatures arises in applications such as DNA sequence analysis, network intrusion detection, biometrics, large scientific experiments, speech recognition and sensor networks. String matching is a related problem. More specifically, we envision a problem where long linear data files (i.e., flat files) contain multiple signatures that are to be found using multiple processors (a parallel processor). This paper evaluates the performance of finding signatures in files residing in the nodes of parallel processors configured as trees, two-dimensional meshes and hypercubes. We assume various combinations of sequential and parallel searching. A unique feature of this work is the assumption that data is pre-loaded onto the processors, as may occur in practice, so load distribution time need not be accounted for. Elegant expressions are found for average signature searching time and speedup, and graphical results are provided.


Introduction

A signature is a relatively small data pattern of interest embedded in a very large (in this paper, sequential) data file. A file may contain multiple signatures; it is assumed that they are temporally distinct and do not overlap each other. Because the files we study are much longer than the signatures, it is assumed that signatures have infinitesimally small length. Such signature searching occurs in network security, signal processing, medicine, image processing, sensor technology and many other fields.

Conclusion 

It has been demonstrated that the expected search time for signatures can be calculated, either analytically or through simulation, in a wide variety of search scenarios for tree, mesh and hypercube networks where load distribution time is not considered. This should also be possible for other types of interconnection networks. Future research should consider other types of file structures or statistical assumptions. This work is of interest in a wide variety of applied areas involving signature searching.
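To illustrate how an analytical estimate and a simulation can be compared, here is a minimal sketch under simplifying assumptions that are not taken from the paper: signature positions are independent and uniformly distributed over the file, the file is split into equal segments pre-loaded on the processors, every processor scans its own segment from the start at unit rate in parallel, and no network topology is modeled. The function names and parameters are hypothetical. Under these assumptions the time to find all k signatures is the maximum of k uniform offsets within a segment, whose mean is k/(k+1) times the segment length.

```python
import random


def simulate_search_time(file_length, num_processors, num_signatures,
                         trials=20000, seed=0):
    """Monte Carlo estimate of the mean time until all signatures are found.

    Assumptions (illustrative only): uniform signature positions, equal
    pre-loaded segments, unit scan rate, all processors searching in parallel.
    """
    rng = random.Random(seed)
    segment = file_length / num_processors
    total = 0.0
    for _ in range(trials):
        finish = 0.0
        for _ in range(num_signatures):
            position = rng.uniform(0.0, file_length)
            # Offset inside the segment that holds the signature; at unit
            # scan rate this is the time at which that processor reaches it.
            offset = position % segment
            finish = max(finish, offset)
        total += finish
    return total / trials


def expected_search_time(file_length, num_processors, num_signatures):
    """Analytical mean under the same assumptions.

    The offsets are i.i.d. uniform on [0, segment), and the expected maximum
    of k such values is segment * k / (k + 1).
    """
    segment = file_length / num_processors
    return segment * num_signatures / (num_signatures + 1)


if __name__ == "__main__":
    L, k = 1000.0, 3
    for n in (1, 2, 4, 8):
        sim = simulate_search_time(L, n, k)
        exact = expected_search_time(L, n, k)
        speedup = expected_search_time(L, 1, k) / exact
        print(f"N={n}: simulated={sim:.1f} analytical={exact:.1f} "
              f"speedup={speedup:.1f}")
```

In this simplified model the speedup over a single processor equals the number of processors, because load distribution time is excluded and the segments are equal; the scenarios studied in the paper (mixed sequential and parallel searching on trees, meshes and hypercubes) would need their own models.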

