
[python] Increasing resolution in PyMuPDF: PDF to IMG

How to Increase Image Resolution

 

https://pymupdf.readthedocs.io/en/latest/recipes-images.html

 

Images - PyMuPDF 1.23.25 documentation


The image of a document page is represented by a Pixmap, and the simplest way to create a pixmap is via method Page.get_pixmap().

This method has many options to influence the result. The most important among them is the Matrix, which lets you zoom, rotate, distort or mirror the outcome.

Page.get_pixmap() by default will use the Identity matrix, which does nothing.

Below, we apply a zoom factor of 2 in each dimension. This doubles the width and the height in pixels, so the resulting image has four times as many pixels (and is roughly four times as large):

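import fitz  # PyMuPDF

# 'page' is a page of an already opened document, e.g. page = doc[0]
zoom_x = 2.0  # horizontal zoom
zoom_y = 2.0  # vertical zoom
mat = fitz.Matrix(zoom_x, zoom_y)  # zoom factor 2 in each dimension
pix = page.get_pixmap(matrix=mat)  # use 'mat' instead of the identity matrix

# Alternative: set the resolution directly; pass 'dpi' by keyword
dpi = 600
pix = page.get_pixmap(dpi=dpi)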

Since version 1.19.2 there is a more direct way to set the resolution: Parameter "dpi" (dots per inch) can be used in place of "matrix". To create a 300 dpi image of a page specify pix = page.get_pixmap(dpi=300). Apart from notation brevity, this approach has the additional advantage that the dpi value is saved with the image file – which does not happen automatically when using the Matrix notation.
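The relationship between a DPI value and a zoom factor can be sketched with plain arithmetic. The helper names below (`dpi_to_zoom`, `expected_pixel_size`) are hypothetical and not part of PyMuPDF; the sketch assumes the PDF convention of 72 points per inch, and the actual pixmap size may differ by a pixel because of rounding.

```python
PDF_POINTS_PER_INCH = 72  # PDF page sizes are expressed in points (1/72 inch)


def dpi_to_zoom(dpi):
    """Return the zoom factor that corresponds to the requested DPI."""
    return dpi / PDF_POINTS_PER_INCH


def expected_pixel_size(width_pt, height_pt, dpi):
    """Estimate the rendered image size in pixels for a page given in points."""
    zoom = dpi_to_zoom(dpi)
    return round(width_pt * zoom), round(height_pt * zoom)


# US Letter page (612 x 792 points) rendered at 300 dpi
print(expected_pixel_size(612, 792, 300))  # (2550, 3300)
```

A zoom factor of 2.0 therefore corresponds to 144 dpi, and 600 dpi corresponds to a zoom factor of roughly 8.33.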

 

vscode-pdf

 

Display pdf file in VSCode.

 

[python] PDF to PNG: saving at high resolution

 

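import os

import fitz  # PyMuPDF

def pdf_to_png(pdf_file, output_folder, dpi=300):
    # Create the output folder if it does not exist yet
    os.makedirs(output_folder, exist_ok=True)

    # Open the PDF file
    pdf_document = fitz.open(pdf_file)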
    
    for page_number in range(pdf_document.page_count):
        # Get the page
        page = pdf_document[page_number]
        
        # Convert the target DPI to a zoom factor (PDF pages are measured at 72 points per inch)
        zoom = dpi / 72.0
        mat = fitz.Matrix(zoom, zoom)
        image = page.get_pixmap(matrix=mat)
        
        # Save the image as a PNG file
        image.save(f"{output_folder}/page_{page_number + 1}.png", "png")

    # Close the PDF file
    pdf_document.close()

if __name__ == "__main__":
    input_pdf = "input.pdf"  # Replace with your PDF file path
    output_folder = "output_images"  # Replace with your output folder
    dpi = 600  # Adjust DPI as needed
    
    pdf_to_png(input_pdf, output_folder, dpi)
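One small refinement, sketched here as an assumption rather than as part of the original script: files named `page_1.png`, `page_10.png`, `page_2.png` do not sort in page order in most file browsers. The hypothetical helper `page_image_path` below zero-pads the page number so that the names sort correctly.

```python
from pathlib import Path


def page_image_path(output_folder, page_number, page_count):
    """Build a zero-padded PNG path for a 0-based page number."""
    width = len(str(page_count))  # enough digits for the last page
    return Path(output_folder) / f"page_{page_number + 1:0{width}d}.png"


# First page of a 120-page document
print(page_image_path("output_images", 0, 120))
```

Inside the loop above, the result could be passed to the save call as `image.save(str(page_image_path(output_folder, page_number, pdf_document.page_count)))`.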

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

https://pypi.org/project/PyMuPDF/

>> pip install PyMuPDF

 

https://mupdf.com/

 

1. Read the PDF files in a directory and print the list.

2. Given a file name, generate one filename_pagenumber.png file per page.

import fitz  # PyMuPDF

 
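# If the script is started from a different working directory and cannot find the files next to it, change the working directory.
import os
from pathlib import Path
# To use the parent folder of src as the working directory instead:
###real_path = Path(__file__).parent.parent
real_path = Path(__file__).parent
print(real_path)
# Change the working directory
os.chdir(real_path)

directory_base = os.path.join(real_path, "ONE")  # join the Path object with the subfolder name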
 


def pdf_to_png(pdf_file, input_pdf_name, output_folder):
    # Create the output folder if it does not exist yet
    os.makedirs(output_folder, exist_ok=True)

    # Open the PDF file
    pdf_document = fitz.open(pdf_file)
    
    for page_number in range(pdf_document.page_count):
        # Get the page
        page = pdf_document[page_number]
        
        # Convert the page to an image
        image = page.get_pixmap()
        
        # Save the image as a PNG file
        image.save(f"{output_folder}/{input_pdf_name}_{page_number + 1}.png", "png")

    # Close the PDF file
    pdf_document.close()

if __name__ == "__main__":
     
    # List the PDF files in the directory (skip subfolders and other file types)
    file_list = [f for f in os.listdir(directory_base) if os.path.isfile(os.path.join(directory_base, f)) and f.lower().endswith(".pdf")]

    # Print the list of files
    for file in file_list:
        print(file)
        
        #input_pdf = "./TWO/"+ file_name +".pdf"  # Replace with your PDF file path
        input_pdf      = "./ONE/"+ file  # Replace with your PDF file path
        input_pdf_name = os.path.splitext(file)[0]
        print(input_pdf_name)
        output_folder  = "./ONE/data"  # Replace with your output folder
        
        pdf_to_png(input_pdf, input_pdf_name, output_folder)
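Step 1 (listing the PDF files in a directory) can also be written with `pathlib` alone. This is a minimal sketch, not the script's original approach; the function name `list_pdf_files` is hypothetical.

```python
from pathlib import Path


def list_pdf_files(directory):
    """Return the PDF files directly inside `directory`, sorted by path."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


# Print each PDF in the current folder with the name used for its PNG files
for pdf in list_pdf_files("."):
    print(pdf.name, pdf.stem)
```

`pdf.stem` gives the file name without the extension, the same value the script obtains with `os.path.splitext(file)[0]`.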

 


Extracting images from a PDF without coding, using PyMuPDF

 

https://wikidocs.net/181972

 

The fitz module of [PyMuPDF](https://github.com/pymupdf/PyMuPDF) can be used to extract images from PDF files. [Command-line module](https://p…

# Extracting images from a PDF without coding, using PyMuPDF


# PyMuPDF


# pip install PyMuPDF



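import fitz  # PyMuPDF

PDF_FILE_PATH = "input.pdf"  # placeholder; replace with your PDF file path

doc = fitz.open(PDF_FILE_PATH)
for i, page in enumerate(doc):
    # Render the whole page as an image (this rasterizes the page; it does not pull out embedded images)
    img = page.get_pixmap()
    img.save(f"./data/{i}.png")  # the ./data folder must already exist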



# Run it directly from the command line
# python -m fitz extract -images input.pdf
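If the same command needs to run for many files, it can be assembled in Python and handed to `subprocess`. The sketch below only builds the argument list shown above; the helper name `build_extract_command` is hypothetical, and it assumes PyMuPDF is installed for the interpreter that runs the script.

```python
import sys


def build_extract_command(pdf_path):
    """Build the argument list for PyMuPDF's command-line image extraction."""
    # sys.executable is the current interpreter, so the same environment is used
    return [sys.executable, "-m", "fitz", "extract", "-images", str(pdf_path)]


print(build_extract_command("input.pdf"))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to execute the extraction.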