Determining a File Type In Java

By Ercan Zengin · Oct. 05, 20 · Tutorial

Most applications need file upload and download features. During these transfers, we sometimes need to determine a file's format, or to verify that the file really has the format the user selected. Java offers several approaches for this, and we list them in this article.

1. Files.probeContentType(Path)

The probeContentType method of the java.nio.file.Files class, introduced in Java 7, returns the content type of the file at the Path passed to it.

Below, we pass the name of the example file, JPG_Test_File.jpg, to the getFileTypeByProbeContentType method. This method calls Files.probeContentType(Path), and we get image/jpeg as the file type.

Java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class FileTypeDetection {

    public static String getFileTypeByProbeContentType(String fileName) {
        String fileType = "Undetermined";
        final File file = new File(fileName);
        try {
            // Returns null when the type cannot be determined
            fileType = Files.probeContentType(file.toPath());
        } catch (IOException ioException) {
            System.out.println("File type not detected for " + fileName);
        }
        return fileType;
    }

    public static void main(String[] args) {
        System.out.println(getFileTypeByProbeContentType("JPG_Test_File.jpg"));
    }
}


Output: image/jpeg

However, if we only change the file's extension to PPTX and pass the new file name to the same method, we get a different result:

Output: application/vnd.openxmlformats-officedocument.presentationml.presentation

If we rename the file and remove the extension completely, the same method cannot determine a file type at all:

Output: null
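
The null result is easy to overlook, because the method only throws on I/O errors. A minimal sketch of one way to guard against it is to wrap the call in an Optional. The names FileTypeProbe and probe are hypothetical and not part of the example above:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper, not part of the article's FileTypeDetection class.
final class FileTypeProbe {

    // Returns the probed content type, or an empty Optional when the
    // platform's detectors cannot determine one or an I/O error occurs.
    static Optional<String> probe(Path path) {
        try {
            return Optional.ofNullable(Files.probeContentType(path));
        } catch (IOException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // The actual result depends on the detectors installed on the platform.
        String type = probe(Paths.get("JPG_Test_File"))
                .orElse("application/octet-stream");
        System.out.println(type);
    }
}
```

Callers then have to decide explicitly what an undetermined type means, instead of passing a null along.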

2. MimetypesFileTypeMap.getContentType(String)

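We can also pass the file's name to the getContentType method of the MimetypesFileTypeMap class, which shipped with the JDK from Java 6, to get the file type. Note that the javax.activation package it belongs to was removed from the JDK in Java 11, so newer versions need it as a separate dependency.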

Here is our method:

Java
// Requires: import javax.activation.MimetypesFileTypeMap;
public static String getFileTypeByMimetypesFileTypeMap(final String fileName) {
    final MimetypesFileTypeMap fileTypeMap = new MimetypesFileTypeMap();
    return fileTypeMap.getContentType(fileName);
}


If we call this method for the file whose extension we changed to PPTX, we get the following file type:

Output: application/octet-stream
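
The generic application/octet-stream result comes from a lookup that is driven by the file name alone. The sketch below illustrates name-based lookup with a fallback default. It is a simplified illustration with a made-up three-entry table, not the actual MimetypesFileTypeMap implementation, and ExtensionTypeMap is a hypothetical name:

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of extension-based lookup.
final class ExtensionTypeMap {

    private static final String DEFAULT_TYPE = "application/octet-stream";

    // Tiny made-up table for illustration only.
    private static final Map<String, String> TYPES = Map.of(
            "jpg", "image/jpeg",
            "png", "image/png",
            "txt", "text/plain");

    static String contentTypeFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return DEFAULT_TYPE; // no extension to look up
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        // Unknown extensions fall back to the generic binary type.
        return TYPES.getOrDefault(extension, DEFAULT_TYPE);
    }

    public static void main(String[] args) {
        System.out.println(contentTypeFor("JPG_Test_File.jpg"));  // image/jpeg
        System.out.println(contentTypeFor("JPG_Test_File.pptx")); // application/octet-stream
    }
}
```

Because the file's bytes are never read, renaming a file is enough to change the answer.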

3. URLConnection.getContentType()

With the getContentType method of the URLConnection class, we can get the content type of a file.

Java
// Requires: java.io.IOException, java.net.MalformedURLException, java.net.URL, java.net.URLConnection
public static String getFileTypeByUrlConnectionGetContentType(final String fileName) {
    String fileType = "Undetermined";
    try {
        // fileName should be an absolute path so that this forms a valid file: URL
        final URL url = new URL("file://" + fileName);
        final URLConnection connection = url.openConnection();
        fileType = connection.getContentType();
    } catch (MalformedURLException badUrlEx) {
        System.out.println("ERROR: Bad URL - " + badUrlEx);
    } catch (IOException ioEx) {
        System.out.println("Cannot access URLConnection - " + ioEx);
    }
    return fileType;
}


If we call this method for the file whose extension we changed to PPTX, we get the following file type:

Output: content/unknown
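
The JDK also offers a content-based check: the static URLConnection.guessContentTypeFromStream method inspects the first bytes of a stream. It recognizes only a small set of signatures, so the following is a sketch rather than a replacement for a full detection library. ContentSniffer and sniff are hypothetical names:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper around the JDK's built-in signature check.
final class ContentSniffer {

    // Guesses a content type from the leading bytes; empty when unrecognized.
    static Optional<String> sniff(byte[] content) {
        // guessContentTypeFromStream needs a stream that supports mark/reset.
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(content))) {
            return Optional.ofNullable(URLConnection.guessContentTypeFromStream(in));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "GIF89a" is the signature at the start of a GIF file.
        byte[] gifHeader = "GIF89a".getBytes(StandardCharsets.US_ASCII);
        System.out.println(sniff(gifHeader).orElse("unknown")); // image/gif
    }
}
```

For a file on disk, the same call could be made on a BufferedInputStream wrapped around Files.newInputStream(path).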

4. Apache Tika

The previous three approaches are provided by the JDK. However, there are third-party alternatives such as Apache Tika. Tika is a widely used library that detects a file's type by analyzing its content, independently of its extension.

Our method takes an InputStream as a parameter and uses the detect method of Apache Tika:

Java
// Requires: java.io.IOException, java.io.InputStream, org.apache.tika.Tika
public static String getFileTypeByTika(InputStream istream) {
    final Tika tika = new Tika();
    String fileType = "";
    try {
        // Detects the type from the content of the stream
        fileType = tika.detect(istream);
    } catch (IOException e) {
        System.out.println("*** getFileTypeByTika - Error while detecting mime type from InputStream ***");
        System.out.println("*** getFileTypeByTika - Error message: " + e.getMessage());
        e.printStackTrace();
    }
    return fileType;
}


If we take the file that was originally in JPEG format but whose extension we changed to PPTX, open it as a FileInputStream, and pass it to getFileTypeByTika, we get the following result:

Output: image/jpeg

Tika detected the type of file correctly.

Tika's detect method also accepts a File instead of an InputStream. The following method uses that overload:

Java
// Requires: java.io.File, java.io.IOException, org.apache.tika.Tika
public static String getFileTypeByTika2(File file) {
    final Tika tika = new Tika();
    String fileTypeDefault = "";
    try {
        fileTypeDefault = tika.detect(file);
    } catch (IOException e) {
        System.out.println("*** getFileTypeByTika2 - Error while detecting file type from File ***");
        System.out.println("*** getFileTypeByTika2 - Error message: " + e.getMessage());
        e.printStackTrace();
    }
    return fileTypeDefault;
}


If we pass the file that was originally in JPEG format, with its extension again changed to PPTX, Tika detects the file type correctly:

Output: image/jpeg

As we can see, Apache Tika detects the file type correctly even when the file extension has been changed. It is a good choice when the file type is crucial or can affect the flow of an application.
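
One way such a check might fit into an upload flow is to compare the type the client claims with the type detected from the content, and reject mismatches. A minimal sketch follows, assuming the detector is supplied as a function so that any detection approach can be plugged in. UploadTypeCheck and matchesClaimedType are hypothetical names, and the stub detector in main is for demonstration only:

```java
import java.util.function.Function;

// Hypothetical helper for validating an upload's claimed MIME type.
final class UploadTypeCheck {

    // True only when the detector's result equals the claimed MIME type
    // (case-insensitive). A null claim or detection result is a mismatch.
    static boolean matchesClaimedType(String claimedType,
                                      byte[] content,
                                      Function<byte[], String> detector) {
        String detected = detector.apply(content);
        return detected != null
                && claimedType != null
                && detected.trim().equalsIgnoreCase(claimedType.trim());
    }

    public static void main(String[] args) {
        // Stub detector standing in for a real one.
        Function<byte[], String> alwaysJpeg = bytes -> "image/jpeg";
        byte[] content = new byte[0];
        System.out.println(matchesClaimedType("image/jpeg", content, alwaysJpeg));      // true
        System.out.println(matchesClaimedType("application/pdf", content, alwaysJpeg)); // false
    }
}
```

With Tika on the classpath, the detector could be bytes -> new Tika().detect(bytes), since the Tika facade also has a detect overload that takes a byte array.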

Java (programming language) Apache Tika
