Hey guys! In this article I'm going to explain what a file signature is and how you can check it in code. First of all, let's start with what a file signature is.
What Is a File Signature?
A file signature is a distinctive mark that identifies a file's type. A file normally consists of:
- Filename
- File Header/Footer
- File Content
The file signature is located in the file header. The header also holds plenty of other information that is useful to an examiner. Every file type has its own signature: PDF files start with one value, for instance, and Microsoft Word files start with another. Let's look at this in practice.

1-File Signature of a PNG File

2-File Signature of a DOCX File
As you can see above, two file types have two different file signatures.
If you open a file in a program that shows its contents as hex, you can see that the file starts with a specific value. The number of bytes a signature takes is not fixed, but for known file types the signatures are standardized. For example, suppose that during an investigation you are looking for a JPG file, but no file with a .jpg extension turns up. This is where file signature analysis steps in and says "this is my job".
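To make the idea concrete, here is a minimal sketch that prints the leading bytes of some data the way a hex viewer would. The class name HexPreview and the method toHex are made up for this illustration and are not part of the project; the byte values are the well-known PNG and PDF signatures.

```java
import java.nio.charset.StandardCharsets;

/** Prints the leading bytes of some data the way a hex viewer would (illustrative only). */
final class HexPreview {

    /** Formats the first {@code count} bytes as space-separated upper-case hex pairs. */
    static String toHex(byte[] data, int count) {
        StringBuilder sb = new StringBuilder();
        int limit = Math.min(count, data.length);
        for (int i = 0; i < limit; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            // & 0xFF turns the signed Java byte into a value from 0 to 255
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The 8-byte PNG signature
        byte[] png = {(byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};
        // A PDF file starts with the ASCII text "%PDF"
        byte[] pdf = "%PDF-1.7".getBytes(StandardCharsets.US_ASCII);

        System.out.println(toHex(png, 8)); // 89 50 4E 47 0D 0A 1A 0A
        System.out.println(toHex(pdf, 4)); // 25 50 44 46
    }
}
```

The two outputs differ from the very first byte, which is exactly what signature analysis relies on.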
This is why it matters for digital forensics. A bad guy can try to hide a file that is important to the investigation by changing its extension. But unless he or she also knows about file signatures and alters the signature, you can still find out what the file really contains. I'm going to keep it short. Let's look at the code.
The Code Section 1
Before you write any code, you need a database of file signatures. If you don't have one, let's start by gathering them. If you already have one, you can jump to the second section.
package com.hex;
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
/**
*
* @author VV
*/
public class get extends DataBase {
    PreparedStatement preparedstatement;

    public static void main(String[] args) throws SQLException {
        get get = new get();
        List<String> extension = new ArrayList<>();
        List<String> hex = new ArrayList<>();
        List<String> description = new ArrayList<>();
        // The listing is spread over 18 pages
        for (int i = 1; i <= 18; i++) {
            String url = "https://www.filesignatures.net/index.php?page=all&currentpage=" + i + "&order=EXT&alpha=All";
            Document html = gethtml(url);
            if (html == null) {
                continue; // the page could not be downloaded
            }
            Element table = html.getElementById("innerTable");
            Elements cells = table.getElementsByTag("td");
            // The first five cells are skipped. After that, cell j % 4 == 1 holds the
            // extension, j % 4 == 2 the hex signature and j % 4 == 3 the description.
            for (int j = 5; j < cells.size(); j++) {
                String text = cells.get(j).text().trim();
                if (j % 4 == 1) {
                    System.out.println("Extension:" + text);
                    extension.add(text);
                } else if (j % 4 == 2) {
                    System.out.println("HEX:" + text);
                    hex.add(text);
                } else if (j % 4 == 3) {
                    System.out.println("Description:" + text);
                    description.add(text);
                }
            }
        }
        // Store every scraped row in the database
        for (int j = 0; j < hex.size(); j++) {
            get.insertLinks(hex.get(j), description.get(j), extension.get(j));
        }
    }
    // Downloads and parses a page; returns null when the page cannot be downloaded.
    public static Document gethtml(String url) {
        try {
            return Jsoup.connect(url).get();
        } catch (IOException exception) {
            return null;
        }
    }
    // Inserts one signature row; returns true if a row was written.
    public boolean insertLinks(String hex, String description, String extension) throws SQLException {
        connection_open();
        try {
            String query = "insert into signaturedb(hex,description,extension) values(?,?,?)";
            preparedstatement = connection.prepareStatement(query);
            preparedstatement.setString(1, hex);
            preparedstatement.setString(2, description);
            preparedstatement.setString(3, extension);
            return preparedstatement.executeUpdate() > 0;
        } finally {
            // Close the connection even if the insert fails
            connection_close();
        }
    }
}
Here is the source I found for building the database: https://www.filesignatures.net/index.php?page=all&order=SIGNATURE&sort=DESC&alpha=All
The main thing is the algorithm; the code may not run on your computer as-is, because it has requirements such as jsoup, the database table and the right columns. But I believe you can handle that. I won't explain this code. If you don't want to write it yourself, you can download the database from this link: https://github.com/osmaoguzhan/signatureAnalysis/blob/master/signaturedb.sql
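If you want to try the row-grouping idea of the scraper without jsoup or a database, the sketch below may help. It assumes, like the index arithmetic in the loop above, that the table has four cells per row, that the first row is a header, and that the first cell of each row is ignored. The names CellGrouping, SignatureRow and group are hypothetical, the sample cells are hand-written rather than scraped, and the nested record needs Java 16 or newer.

```java
import java.util.ArrayList;
import java.util.List;

final class CellGrouping {

    /** One row of the signature table (hypothetical type, not part of the project). */
    record SignatureRow(String extension, String hex, String description) {}

    /**
     * Groups a flat list of cell texts into rows of four cells. The first four
     * cells are treated as the header and skipped. In every later row, cell 0 is
     * ignored and cells 1 to 3 hold the extension, the hex value and the description.
     * An incomplete trailing row is dropped.
     */
    static List<SignatureRow> group(List<String> cells) {
        List<SignatureRow> rows = new ArrayList<>();
        for (int start = 4; start + 3 < cells.size(); start += 4) {
            rows.add(new SignatureRow(
                    cells.get(start + 1).trim(),
                    cells.get(start + 2).trim(),
                    cells.get(start + 3).trim()));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hand-written sample data standing in for the scraped <td> texts
        List<String> cells = List.of(
                "", "Extension", "Signature", "Description",
                "", "PNG", "89 50 4E 47 0D 0A 1A 0A", "PNG image",
                "", "PDF", "25 50 44 46", "PDF document");
        for (SignatureRow row : group(cells)) {
            System.out.println(row.extension() + " -> " + row.hex());
        }
    }
}
```

Keeping the three values of a row in one object means they cannot drift out of step, which can happen with three parallel lists when a cell is missing.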
The Code Section 2
Yes, this is where the rubber meets the road. OK, let's start with the main part of the analysis.
// Takes the path of a folder, converts every file in it with convertToHex
// and passes the result to getSignature.
public void signature(String path) {
    File folder = new File(path);
    File[] files = folder.listFiles();
    // listFiles() returns null when the path is not a folder
    if (files == null || files.length == 0) {
        System.out.println("This folder is empty. Please choose a folder that is not empty!!");
        return;
    }
    for (File file : files) {
        if (!file.isFile()) {
            continue; // skip subfolders
        }
        String name = file.getName();
        // The extension the file name claims (empty if the name has no dot)
        int dot = name.lastIndexOf('.');
        String extension = dot < 0 ? "" : name.substring(dot + 1);
        System.out.println("FILE NAME IS " + file);
        // Plain text files have no signature, so there is nothing to check
        if (extension.equalsIgnoreCase("txt")) {
            continue;
        }
        try {
            StringBuilder hex = convertToHex(file); // the file's hex value
            getSignature(hex, extension, name);
        } catch (IOException ex) {
            Logger.getLogger(hex.class.getName()).log(Level.SEVERE, null, ex);
        }
    }
}
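// Converts the whole file to a hex dump with 16 bytes per line.
// Needs java.io.BufferedInputStream in addition to the other java.io imports.
public static StringBuilder convertToHex(File file) throws IOException {
    int bytesCounter = 0;
    int value;
    StringBuilder sbHex = new StringBuilder();    // the line being built
    StringBuilder sbResult = new StringBuilder(); // the complete dump
    // try-with-resources closes the stream even if reading fails
    try (InputStream is = new BufferedInputStream(new FileInputStream(file))) {
        while ((value = is.read()) != -1) {
            sbHex.append(String.format("%02X ", value));
            if (bytesCounter == 15) {
                // 16 bytes collected: finish this line and start a new one
                sbResult.append(sbHex).append("\n");
                sbHex.setLength(0);
                bytesCounter = 0;
            } else {
                bytesCounter++;
            }
        }
    }
    if (bytesCounter != 0) {
        // Pad the last, incomplete line
        for (; bytesCounter < 16; bytesCounter++) {
            sbHex.append(" ");
        }
        sbResult.append(sbHex).append("\n");
    }
    return sbResult;
}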
Above is the code that converts a file to hex, which is necessary for analyzing the signature. Once the file is converted, we have data that looks like the first picture. With the extension and the hex value in hand, we are almost done. Now we'll get the signatures and real extensions from the database and check whether they match.
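Since a signature sits at the very start of a file, a lighter alternative to converting the whole file is to read only its first bytes. The sketch below is one way to do that, not the method used in the project: HeaderReader and readHeaderHex are hypothetical names, and InputStream.readNBytes(int) requires Java 11 or newer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class HeaderReader {

    /** Reads at most {@code maxBytes} from the start of the file and returns them as "XX XX ...". */
    static String readHeaderHex(Path file, int maxBytes) throws IOException {
        byte[] head;
        try (InputStream in = Files.newInputStream(file)) {
            // Returns fewer bytes when the file is shorter than maxBytes
            head = in.readNBytes(maxBytes);
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : head) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // A temporary file that carries a .jpg name but PNG content
        Path sample = Files.createTempFile("sample", ".jpg");
        try {
            Files.write(sample, new byte[]{(byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A, 0x00, 0x00});
            System.out.println(readHeaderHex(sample, 8)); // 89 50 4E 47 0D 0A 1A 0A
        } finally {
            Files.deleteIfExists(sample);
        }
    }
}
```

How many bytes to read is a judgment call: the value should be at least as long as the longest signature you want to recognize.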
public void getSignature(StringBuilder hex, String extension, String file) {
    // Takes three arguments: the file's hex value, the extension from the file name
    // (which may or may not be genuine) and the file name.
    List<String> description = new ArrayList<>();
    List<String> hexDB = new ArrayList<>();
    List<String> extDB = new ArrayList<>();
    try {
        connection_open();
        // Note: the insert in Section 1 names the third column "extension";
        // use whichever column name your table actually has.
        String query = "select hex,description,ext from signaturedb";
        preparedstatement = connection.prepareStatement(query);
        ResultSet rs = preparedstatement.executeQuery();
        // Copy the database columns into the lists
        while (rs.next()) {
            hexDB.add(rs.getString("hex"));
            description.add(rs.getString("description"));
            extDB.add(rs.getString("ext"));
        }
    } catch (SQLException e) {
        Logger.getLogger(hex.class.getName()).log(Level.SEVERE, null, e);
    } finally {
        connection_close();
    }
    // Hand the lists to match together with the information about our file
    match(hexDB, extDB, description, hex, extension, file);
}
public void match(List<String> hexDB, List<String> extDB, List<String> description, StringBuilder hex, String ext, String file) {
    // Several file types can share one signature (a DOCX file is a ZIP archive, for
    // example), so first collect the extension of every signature the file starts with.
    List<String> realExtensions = new ArrayList<>();
    for (int i = 0; i < hexDB.size(); i++) {
        String signature = hexDB.get(i);
        // Signatures longer than the file's hex value cannot match and would make substring fail
        if (signature.isEmpty() || signature.length() > hex.length()) {
            continue;
        }
        // The hex value covers the whole file, so cut it down to the length of this
        // signature. If the signature is 4 hex values long, we keep the first 4.
        String control = hex.substring(0, signature.length());
        if (control.equalsIgnoreCase(signature)) {
            realExtensions.add(extDB.get(i));
        }
    }
    // "\u001b[0m" resets the terminal color after each message
    if (realExtensions.isEmpty()) {
        System.out.println("\u001b[41mThe signature couldn't be found in the DB!!\u001b[0m");
    } else if (realExtensions.stream().anyMatch(ext::equalsIgnoreCase)) {
        System.out.println("\u001b[42mEverything is OK!! There is no manipulation!!\u001b[0m");
    } else {
        System.out.println("\u001b[41mDoesn't Match!!\u001b[0m");
        System.out.println("\u001b[41mReal extension :" + String.join(", ", realExtensions) + "\u001b[0m");
    }
    System.out.println("----------------------------------------------");
}
The code above boils down to four steps:
- Get the hex value and the filename extension of the file you want to examine.
- Compare the start of that hex value with the known signatures in the database.
- Compare the extension stored with the matching signature against the file's extension.
- If both agree, the file is what it claims to be.
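The steps above can be sketched without a database as follows. This is a simplified illustration under stated assumptions, not the project's code: SignatureCheck, Signature, Result, candidates and check are hypothetical names, the four-entry KNOWN list stands in for the signaturedb table, and the nested record needs Java 16 or newer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class SignatureCheck {

    /** One known signature (hypothetical type standing in for a database row). */
    record Signature(String hex, String extension) {}

    enum Result { MATCH, MISMATCH, UNKNOWN }

    // A tiny stand-in for the signature table. ZIP and DOCX share a signature
    // because a DOCX file is a ZIP archive.
    static final List<Signature> KNOWN = List.of(
            new Signature("89 50 4E 47 0D 0A 1A 0A", "PNG"),
            new Signature("25 50 44 46", "PDF"),
            new Signature("50 4B 03 04", "ZIP"),
            new Signature("50 4B 03 04", "DOCX"));

    /** Returns the extensions of all known signatures that the header starts with. */
    static List<String> candidates(String headerHex) {
        List<String> found = new ArrayList<>();
        String header = headerHex.toUpperCase(Locale.ROOT);
        for (Signature s : KNOWN) {
            if (header.startsWith(s.hex())) {
                found.add(s.extension());
            }
        }
        return found;
    }

    /** Compares the extension a file claims with the extensions its signature allows. */
    static Result check(String headerHex, String claimedExtension) {
        List<String> found = candidates(headerHex);
        if (found.isEmpty()) {
            return Result.UNKNOWN;
        }
        for (String ext : found) {
            if (ext.equalsIgnoreCase(claimedExtension)) {
                return Result.MATCH;
            }
        }
        return Result.MISMATCH;
    }

    public static void main(String[] args) {
        System.out.println(check("89 50 4E 47 0D 0A 1A 0A 00 00", "jpg")); // MISMATCH
        System.out.println(check("50 4B 03 04 14 00", "docx"));            // MATCH
        System.out.println(check("00 11 22 33", "bin"));                   // UNKNOWN
    }
}
```

Collecting every candidate before deciding avoids a false alarm when several file types share the same signature.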
That's enough for now. If you have any questions about the code, please let me know. You can also find the project here. See you soon!!
