Thursday, 21 May 2020

Summary of the Structure of PDF files

A PDF can be viewed as a combination of different file types presented in a single container. A PDF file can contain text, vector art, images and fonts, and other file formats can be embedded in it, even the native files that were used to create the PDF in the first place.

PDF is an object-oriented file format in which items can be connected to each other directly or indirectly.



The objects within a PDF file can be divided into the following types:

Dictionaries

A collection of key and value pairs in which each value is either a direct object or a reference to an indirect object. Dictionaries can be seen as the glue holding together the elements in a PDF file. A typical page dictionary contains entries such as the following:



The Contents stream has an attributes dictionary that contains the name of the filter and the length of the stream.
The CropBox array contains the coordinates of the rectangle that defines the area that is visible on the page.
The MediaBox array contains the coordinates of the rectangle that defines the media size. This will typically match a standard media size such as Letter or A4, which allows the PDF page to be reliably printed on a device that supports these standard media sizes.
The Resources dictionary contains references and information for the elements that are needed to reliably output the visual elements of the page, such as colors, fonts and images.
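
To make the entries above more concrete, here is a minimal Python sketch that models a page dictionary and writes it out in PDF syntax. The object numbers, the box values and the Ref and to_pdf helpers are all assumptions made for illustration; they are not taken from a real file or from any library.

```python
from collections import namedtuple

# Hypothetical helper type: an indirect reference "object-number generation R".
Ref = namedtuple("Ref", ["num", "gen"])


def to_pdf(obj):
    """Serialize a small subset of PDF object types to PDF syntax."""
    if isinstance(obj, Ref):
        return f"{obj.num} {obj.gen} R"
    if isinstance(obj, dict):
        items = " ".join(f"/{key} {to_pdf(value)}" for key, value in obj.items())
        return f"<< {items} >>"
    if isinstance(obj, list):
        return "[" + " ".join(to_pdf(item) for item in obj) + "]"
    if isinstance(obj, str):
        return "/" + obj  # strings in this sketch stand for PDF names only
    if isinstance(obj, (int, float)):
        return str(obj)
    raise TypeError(f"unsupported type: {type(obj).__name__}")


# Illustrative page dictionary; every number here is made up for the example.
page = {
    "Type": "Page",
    "Parent": Ref(2, 0),
    "MediaBox": [0, 0, 612, 792],    # US Letter, in points
    "CropBox": [36, 36, 576, 756],   # visible area inset by half an inch
    "Resources": Ref(5, 0),
    "Contents": Ref(4, 0),
}

print(to_pdf(page))
```

The sketch shows how one dictionary mixes direct objects (the name and the two arrays) with references to indirect objects stored elsewhere in the file.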
 
Streams

A content stream is the sequence of operators that draws information onto the page. The stream normally also relies on elements of the page resources dictionary, such as colors and fonts. The content of a page is stored either as a single stream or as an array of streams.

q
567.48 61.011 -540 720 re
W* n
q
/GS0 gs
0 720 -541.1399536 0 567.4799194 61.0105438 cm
/Im0 Do
Q
Q
/CS0 cs 0.302 0.302 0.302  scn
1 i 
/GS1 gs
56.7 286.911 m
56.7 295.191 56.7 303.471 56.7 311.751 c
59.1 311.751 61.5 311.751 63.9 311.751 c
63.9 306.831 63.9 301.911 63.9 296.991 c
65.88 296.991 67.8 296.991 69.72 296.991 c
69.72 301.191 69.72 305.391 69.72 309.591 c
72 309.591 74.22 309.591 76.5 309.591 c
76.5 305.391 76.5 301.191 76.5 296.991 c
81.06 296.991 85.62 296.991 90.18 296.991 c
90.18 293.631 90.18 290.271 90.18 286.911 c
79.02 286.911 67.86 286.911 56.7 286.911 c
f*

You can see that there are several references to items in the page resources dictionary:
GS0 is a reference to a graphics state parameter dictionary and gs is the operator that applies it.
Im0 is an image XObject and the Do operator draws it.
CS0 is a reference to a color space. The cs operator selects it for non-stroking (fill) operations and the scn operator then sets the fill color within it.

You can also see several path operators in use: re (rectangle), m (moveto), c (curve) and f* (fill, using the even-odd rule).
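
Content streams use postfix notation: the operands come first and the operator that consumes them comes last. The following Python sketch groups a simple stream into operator and operand pairs. It is a rough illustration only and assumes whitespace-separated numbers, names and operators; strings, arrays, dictionaries and inline images are not handled, and parse_content_stream is a hypothetical helper name.

```python
def parse_content_stream(text):
    """Group a simple content stream into (operator, operands) pairs.

    Assumes whitespace-separated tokens only: numbers, /Names and operators.
    """
    instructions = []
    operands = []
    for token in text.split():
        if token.startswith("/"):
            operands.append(token)          # a name operand such as /GS0
            continue
        try:
            operands.append(float(token))   # a numeric operand
        except ValueError:
            # Anything else is treated as an operator; it consumes the
            # operands collected so far.
            instructions.append((token, operands))
            operands = []
    return instructions


sample = """
/GS0 gs
56.7 286.911 m
56.7 295.191 56.7 303.471 56.7 311.751 c
f*
"""

for operator, operands in parse_content_stream(sample):
    print(operator, operands)
```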

Text strings

These can be encoded either with single-byte characters (PDFDocEncoding, which is comparable to ANSI) or as Unicode (multi-byte). One example is the last modified date (ModDate), which is held as a text string in the document information dictionary.
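
PDF date strings follow the pattern D:YYYYMMDDHHmmSS followed by an optional time zone offset. The Python sketch below converts such a string into a datetime. The sample value is hypothetical, parse_pdf_date is an illustrative helper name, and the sketch assumes that every field down to the seconds is present, although shorter forms are also allowed.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYYMMDDHHmmSS followed by an optional offset such as +01'00' or Z.
PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})"
    r"(?:(Z)|([+-])(\d{2})'?(\d{2})?'?)?$"
)


def parse_pdf_date(value):
    """Convert a fully specified PDF date string into a datetime."""
    match = PDF_DATE.match(value)
    if match is None:
        raise ValueError(f"not a full PDF date string: {value!r}")
    groups = match.groups()
    year, month, day, hour, minute, second = (int(g) for g in groups[:6])
    zulu, sign, off_hours, off_minutes = groups[6:]
    tz = None
    if zulu:
        tz = timezone.utc
    elif sign:
        offset = timedelta(hours=int(off_hours), minutes=int(off_minutes or 0))
        tz = timezone(offset if sign == "+" else -offset)
    return datetime(year, month, day, hour, minute, second, tzinfo=tz)


# Hypothetical value, in the style of a ModDate entry.
print(parse_pdf_date("D:20200521103000+01'00'"))
```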





Images

Images are normally referenced from the page resources. The image stream has an associated attributes dictionary that describes the data within the stream. BitsPerComponent gives the number of bits used for each color component of a single pixel (dot) within the image. The ColorSpace entry describes the color model that is used to define the colors within the image.
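
The attributes are enough to work out how much uncompressed data an image holds. The Python sketch below does the arithmetic for the three device color spaces, using the rule that each row of samples starts on a byte boundary. The raw_image_size helper and the sample dimensions are assumptions for illustration.

```python
# Number of color components for the device color spaces (a small subset).
COMPONENTS = {"DeviceGray": 1, "DeviceRGB": 3, "DeviceCMYK": 4}


def raw_image_size(width, height, bits_per_component, color_space):
    """Return the size in bytes of the uncompressed image samples."""
    bits_per_row = width * COMPONENTS[color_space] * bits_per_component
    bytes_per_row = (bits_per_row + 7) // 8  # round up to a whole byte per row
    return bytes_per_row * height


# A hypothetical 640 x 480 RGB image with 8 bits per component.
print(raw_image_size(640, 480, 8, "DeviceRGB"))
```

The filter named in the stream dictionary usually compresses this data, so the Length value in the file is normally much smaller than the figure calculated here.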



Names

Names are normally used to identify a dictionary or a dictionary item. For example, the pages dictionary has a name "Type" with the value "Pages" and a single page has a name "Type" with the value "Page".




Arrays

Ordered collections that hold values and/or references to other elements. For an example, see the page media box described under Real numbers below.

Real numbers

Decimal numbers. For example, they are used to define the rectangle of the page media box, which is written as an array of four numbers.
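
The numbers in a media box are measured in points, where one point is 1/72 of an inch by default. The following Python sketch converts a box array into millimetres. The box values are the usual A4 and US Letter sizes, and box_size_mm is a hypothetical helper name.

```python
POINTS_PER_INCH = 72
MM_PER_INCH = 25.4


def box_size_mm(box):
    """Return (width, height) in millimetres for a [llx lly urx ury] box."""
    llx, lly, urx, ury = box
    width_pt = abs(urx - llx)
    height_pt = abs(ury - lly)
    to_mm = MM_PER_INCH / POINTS_PER_INCH
    return round(width_pt * to_mm, 1), round(height_pt * to_mm, 1)


print(box_size_mm([0, 0, 595.276, 841.89]))  # A4
print(box_size_mm([0, 0, 612, 792]))         # US Letter
```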


Integers

Whole numbers, used for example to record the total number of pages in the PDF file.



For further details see the PDF Specification at https://www.adobe.com/devnet/pdf/pdf_reference.html

Contact:

Michael Peters

Wednesday, 20 May 2020

Understanding of Colour and Colour models

There are a number of color models, but I am only going to cover two here as they are the most often used.

RGB

This color model is primarily used to describe light. It is used mainly in cameras and scanners. It has three color elements that, when added together at 100%, represent white or pure light. The three colors are Red, Green and Blue. The range of colors the model can describe is very wide, which is fine until printing is required and that printing is done through the CMYK color model. The model uses three values, each expressed either in the range 0 to 255, as in Windows and applications such as Photoshop, or as a decimal number from 0 to 1, as in PDF.

RGB is an additive color model. Adding all of the colors together at full intensity results in white.

RGB Colour merge
In the web world RGB colours are represented by hex number combinations (the numbering system is base 16). So for example Red would be #FF0000, Green would be #00FF00 and Blue would be #0000FF. Black is #000000 and White is #FFFFFF.
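
The three ways of writing an RGB value (hex, 0 to 255 and 0 to 1) are easy to convert between. The Python sketch below shows the arithmetic; hex_to_rgb and rgb_to_unit are illustrative helper names, and rounding to three decimal places is an arbitrary choice.

```python
def hex_to_rgb(hex_color):
    """Convert '#RRGGBB' into three integers in the range 0-255."""
    digits = hex_color.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected #RRGGBB, got {hex_color!r}")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_unit(rgb):
    """Scale 0-255 values to the 0-1 range used in PDF."""
    return tuple(round(channel / 255, 3) for channel in rgb)


for name, code in [("Red", "#FF0000"), ("Green", "#00FF00"), ("White", "#FFFFFF")]:
    rgb = hex_to_rgb(code)
    print(name, rgb, rgb_to_unit(rgb))
```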

CMYK

Cyan/Magenta/Yellow/Key (black), used primarily in printing.

The colors are created by printing the inks on top of each other to achieve the required shades. Overlaps may be required on the edges (trapping) to ensure that gaps are not visible, as different paper types can expand and shrink when the ink or toner is applied. The color model is much more limited in its range than RGB, and therefore care needs to be taken when converting from RGB to CMYK. This can be handled through color management systems, by adding additional colors to the printing run (such as Hexachrome) or by using spot colors, which are usually pre-mixed colors such as Pantone.

Printing is affected by the resolution of the input and output, by the paper stock being printed onto (both its surface quality and its base color) and by the attributes of the inks being used. Additionally, output effects and colors can be modified and enhanced through varnishes such as UV, and through foils that provide metallic effects.

The model uses 4 values each as a percentage of the 

CMYK is a subtractive color model. In theory, combining cyan, magenta and yellow at full strength results in black. In practice this is more than likely to produce a dirty color, which is why the K in CMYK gives printers a real black ink with which to print a true black.
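
The relationship between the two models can be illustrated with the simplest possible conversion, in which as much color as possible is moved into the black channel. This Python sketch is plain arithmetic under that assumption. It is not how a color management system converts colors, since real conversions depend on profiles for the inks and the paper, and rgb_to_cmyk is a hypothetical helper name.

```python
def rgb_to_cmyk(r, g, b):
    """Naive conversion from RGB (0-1) to CMYK (0-1), moving grey into K."""
    k = 1 - max(r, g, b)
    if k == 1:
        return 0.0, 0.0, 0.0, 1.0   # pure black: use only the K ink
    c = (1 - r - k) / (1 - k)
    m = (1 - g - k) / (1 - k)
    y = (1 - b - k) / (1 - k)
    return c, m, y, k


print(rgb_to_cmyk(1, 0, 0))        # red: magenta plus yellow
print(rgb_to_cmyk(0.5, 0.5, 0.5))  # mid grey: black ink only
print(rgb_to_cmyk(0, 0, 0))        # black
```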

CMYK Colour merge

This is a simple look at color and I will expand on this in a future blog.

Contact info:

Michael Peters

Tuesday, 19 May 2020

What is an Acrobat Plug-in?


A plug-in is a way for software developers to add functionality to Acrobat or to modify its current functionality.

Why are plug-ins required?

Adobe provides a product that is intended to be used across multiple industries and organisations. Supporting every vertical market within the product itself would bloat the application by providing features that would only be used by relatively few people when compared with the whole Acrobat market.

Can Acrobat plug-ins be used in the Adobe Reader?

Special support needs to be added to the plug-in so that it can run under Adobe Reader. In addition, a Reader plug-in requires a special license and needs to go through an approval process with Adobe Systems Inc. See https://www.adobe.com/devnet/reader/ikla.html.

Are plug-ins specific to a particular version of Adobe Acrobat?

We have plug-ins that we developed for Acrobat 6 that still run without modification in Acrobat DC. However, if a plug-in uses new features that are specific to a later version then it won't work under earlier versions. Plug-ins for earlier versions that used the Adobe Dialog Manager (ADM) won't work in current versions of Acrobat.

Examples of Plug-ins
  • New security handlers that might be specific to a particular organisation. For example, we have developed security handlers that do not allow PDF files to be viewed outside a particular organisation's offices.
  • New annotations. For example, we created a plug-in that supported all of the British Standard Markups.
  • Flattening annotations and form fields into the main document. This ensured that they could not be changed or modified and that they would print as part of the document even if the printing of annotations was switched off.
  • Adding text and images to PDF files.
  • Creating a table of contents for PDF files.
  • Adding fields for variable data printing.
  • Hardware integration of Adobe Acrobat into whiteboards and interactive tables.

Contact Info:

Michael Peters

Tuesday, 11 February 2020

PDF Software Development Beyond Acrobat

The Adobe PDF Library

The Adobe PDF library can basically be seen as a software developer's version of Adobe Acrobat without the user interface. However, it is far more than that.




Platform availability

Acrobat is a 32-bit application on Windows and is also available for the Mac. The PDF library, however, is available on far more platforms and also as a 64-bit offering. The library is available for the following platforms:
  • Windows 32-bit
  • Windows 64-bit
  • Mac 32-bit
  • Mac 64-bit
  • Linux 32-bit
  • Linux 64-bit
  • Solaris Sparc 32-bit
  • Solaris Sparc 64-bit
  • Solaris Intel 32-bit
  • Solaris Intel 64-bit
  • AIX 32-bit
  • AIX 64-bit
  • HP/UX PA-RISC 32-bit
  • HP/UX PA-RISC 64-bit
  • HP/UX Itanium 32-bit
  • HP/UX Itanium 64-bit

Where is it used?

The library is built into Adobe's CC (Creative Cloud) applications such as Adobe InDesign, Adobe Photoshop and Adobe Illustrator, and is available to third-party companies as an OEM offering.

Our licensing and experience

Mapsoft has extensive experience of working with the Adobe PDF library and the Adobe Acrobat SDK (software development kit).

Mapsoft licences the library for use in our applications, and we have extensive experience in using it for other organisations' offerings. The software development kit shares the same code base across Adobe Acrobat, Adobe Reader and the PDF library. In some cases a product that is available as a plug-in for Acrobat can have most of its code reused for an Adobe PDF library offering. The advantage is that library offerings are not constrained by the same licensing as Adobe Acrobat, in particular in being able to run in a server environment. Datalogics, who license the PDF library, have now created their own interfaces that can be used through .NET and through Java.

In general the PDF library is kept in step with both Acrobat and changes in PDF, most recently with the jump from PDF version 1.7 to PDF version 2.0. There is also extensive support for PDF/X and PDF/A and the ability to convert from generic PDF to these specific versions.

The PDF library is not an end user product. It is for use by developers and is capable of interacting with PDF files at a very low level. Although PDF is an ISO standard, there is great confidence to be gained from choosing a software development kit that has been created by the originators of the PDF standard and is used extensively in products from Adobe, rather than choosing a third-party offering.

Datalogics, who are responsible for licensing the PDF library to third-party software developers, also provide support and maintenance. In addition, they license the binaries in cases where development is not required because that service has been provided by another organisation such as Mapsoft.

For more information please contact Michael Peters who is the Technical Director at Mapsoft.

Michael Peters
mpeters@mapsoft.com
www.mapsoft.com

For more information on the Adobe PDF library please see the Datalogics website at: https://www.datalogics.com/products/pdf/pdflibrary/



Wednesday, 18 December 2019

DiamondTouch Table and Adobe Acrobat Integration



The world's first multi-user touch table and a plug-in developed by Mapsoft running under Adobe Acrobat.


Contact:

Michael Peters
mpeters@mapsoft.com
https://www.linkedin.com/in/mpmapsoft/
http://www.mapsoft.com

Developing User Interfaces for Adobe Products


Choices when developing extensions and plug-ins for Adobe Creative Cloud and Document Cloud products

Most of Adobe's products have either come from other companies that merged with Adobe Systems or have been in development for a considerable length of time. There is more than one way of creating extensions for most of Adobe's products, and the approaches to developing graphical user interfaces vary especially widely.

Adobe Illustrator, for example, requires the use of HTML and JavaScript for its user interfaces. Adobe Acrobat supports native user interfaces on both Windows and Mac platforms, although until recent versions it had its own user interface technology called ADM (Adobe Dialog Manager). The same technology was also used in Adobe Illustrator; however, it has been removed over the last few versions of the products. For Adobe Acrobat there is an example plug-in using wxWidgets (https://www.wxwidgets.org/), an open source library that can be used across both Windows and Mac platforms, but unfortunately this just means having to educate developers on yet another user interface technology. Adobe InDesign probably provides the most options, supporting its own resource format and also HTML5.

Extensions for a lot of Adobe's Creative Cloud products can also be supplied as hybrids, combining HTML and JavaScript extensions with native plug-ins within the same package and product. This array of different technologies has not been made any easier by the fact that over recent years Adobe has created technologies and then consigned them to history. At one point most of the Creative Suite (precursor to the Creative Cloud) products supported user interfaces based on Flash. That was then discontinued in favour of HTML 5, CSS 3 and JavaScript, an approach known as the Adobe Common Extensibility Platform (CEP). Adobe Acrobat, a part of both the Document Cloud and the Creative Cloud, follows its own development path in not using this technology.

Location of Adobe's current software development kits and CEP

Adobe's SDKs (software development kits) are available from various links, but the most up-to-date set for the latest iterations of Adobe products is available from Adobe.io.

Beyond the GUI

Of course there is a lot more to a plug-in or extension product than the user interface, but generally this is the main element aiding or constraining cross-platform software development, and of course it is the main part of a product that the user interacts with. The Adobe CEP interface is a big move to try to remove these constraints and to provide a consistent GUI across multiple products. Unfortunately, one of our main development platforms is still Adobe Acrobat, and this technology has not yet been adapted for this product.

CEP is advertised as a platform for those not wanting to use C++. For many products this is certainly the case, but there are definite constraints, and a decision has to be made as to whether enough of the developer interface has been exposed to produce the product that is required. This will vary from product to product. For interaction, speed, multi-threading and extensive task handling, using the software development kits natively is often the only option. An example of this is our integration of SMART boards and Adobe Illustrator, where the CEP technology just wasn't sufficient. CEP does have support for Node.js, which brings in an extensive number of JavaScript libraries. However, most of these applications are client based, so this is generally only useful in a client-server type environment. The plug-in interfaces are nearly all C++, for which there is also a very extensive (probably more extensive) set of libraries that can be used.

Contact:

Michael Peters
mpeters@mapsoft.com
https://www.linkedin.com/in/mpmapsoft/
http://www.mapsoft.com

Tuesday, 17 December 2019

SMART integration with Adobe Illustrator using an Adobe Illustrator Plug-in



Enhance design & product review meetings by using SMART visual collaboration solutions with the popular vector graphics editor Adobe Illustrator.

Technologies Used

SMART White Board

Adobe Illustrator

Adobe Illustrator SDK

Contact Information

Michael Peters

https://www.linkedin.com/in/mpmapsoft/

mpeters@mapsoft.com

http://www.mapsoft.com