CHAPTER IV. RESULTS
4.7 Case Analysis
4.7.1 Characterization of three territories at the regional level
There are a number of commercial solutions, applications, and vendors who will, for a high price, render ESI into a paginated format. The most commonly used solution at present is an application called LAW PreDiscovery (“LAW”). As this represents, essentially, an industry standard, I will use it for comparison and for analysis of the problems inherent in existing approaches. It is important to understand the costs associated with these platforms, their performance and scalability, and the potential impact they have on smaller law firms or solo practitioners.
The pricing structure for LAW is represented in table 9.2.1 and table 9.2.2. Pricing is by annual subscription, with discounts for multi-year licensing agreements. For the purposes of considering conversion of ESI from native electronic format to a paginated format, LAW requires the Admin[4], ED Loader[5], and TIFF Conversion[6] modules. This minimum functionality for a single station will cost between $4,013 and $5,457 annually depending on license agreement length.

The three modules required to perform the base minimum conversion to paginated format leave large functionality gaps, including an inability to print, Bates number, perform quality control / edit functions, or OCR the resulting TIFF images. A more realistic option is the EDD Premium bundle, which includes the necessary modules to perform these additional functions. The cost ranges from $6,010 to $8,168 depending on license duration. This is, again, for a single workstation. After obtaining the base package, the system scales at a cost of between $2,480 and $4,379 per node per year depending on license agreement length and which OCR module is used.

[4] Allows for base functions in creating and managing case files.
[5] Allows electronic information to be imported into a case.
[6] Allows batch conversion of electronic files to TIFF format.

Module                     1-Year Pricing   2-Year Pricing   3-Year Pricing
Scan-Unlimited             $3,086           $2,547           $2,371
LAWtsi Scan                $3,086           $2,547           $2,371
ED Loader                  $3,086           $2,547           $2,371
OCR (ABBYY Fine Reader)    $2,008           $1,646           $1,476
QC/Edit                    $1,367           $1,125           $1,004
Tiffing                    $1,367           $1,125           $1,004
Endorse                    $1,004           $823             $738
OCR                        $1,004           $823             $738
E-Print                    $1,004           $823             $738
Print                      $1,004           $823             $738
Full-Text Indexing         $1,004           $823             $738
Admin                      $1,004           $823             $738
Searchable PDF             $1,004           $823             $738
Table 9.2.2: LAW PreDiscovery Module Pricing Structure
The cost of the processing platform directly affects the cost of electronic discovery specifically and litigation generally. Let us suppose a given node may process, on average, \(g\) gigabytes of native ESI per day. Assuming perfect scalability, the minimum cost of processing a corpus \(c\) gigabytes in size with a deadline of \(d\) days can be expressed by first solving the equation \(\frac{c}{gd} = n\) to determine how many nodes \(\lceil n \rceil\) are required to process \(c\) gigabytes of ESI in \(d\) days if each node can process \(g\) gigabytes per day. The minimum cost, assuming perfect distributed processing and the cheapest licensing per annum, is \(\$6{,}010 + (\lceil n \rceil - 1) \cdot \$2{,}480\). Under this formulation, the minimum hardware/software necessary to support processing will vary based on the job size and deadline length which must be supported. Using this we can establish a pricing floor per gigabyte in an ideal, but not obtainable, configuration. By setting \(d = 1\) we can then set how many gigabytes we wish to process per day, \(c\), and determine the minimum pricing based on how many gigabytes per day each individual node achieves, \(g\). Thus, for daily capacity \(c\) at per node rate \(g\), our minimum cost per gigabyte assuming perfect distribution and processing at 100% utilization over a year would be
\[ \frac{\$6{,}010 + (\lceil c/g \rceil - 1) \cdot \$2{,}480}{365} \]
per gigabyte to break even on the LAW licensing costs.
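The licensing expression above can be evaluated directly. The following is a minimal sketch, not part of LAW or any vendor tooling; the function names and the example throughput of 25 gigabytes per node per day are hypothetical and chosen only for illustration, while the two dollar constants are the cheapest figures quoted in the text.

```python
import math

# Cheapest annual figures quoted in the text (USD).
BASE_LICENSE = 6010       # EDD Premium bundle, first workstation
PER_NODE_LICENSE = 2480   # each additional node

def nodes_required(c, g, d=1):
    """Whole nodes needed to process c GB in d days at g GB per node per day."""
    return math.ceil(c / (g * d))

def annual_license_cost(nodes):
    """Base bundle plus the per-node price for every node after the first."""
    return BASE_LICENSE + (nodes - 1) * PER_NODE_LICENSE

def licensing_floor(c, g):
    """The expression above with d = 1: annual licensing cost divided by 365."""
    return annual_license_cost(nodes_required(c, g)) / 365

if __name__ == "__main__":
    # Hypothetical inputs: 100 GB of daily capacity at 25 GB per node per day.
    n = nodes_required(100, 25)
    print(n, annual_license_cost(n), round(licensing_floor(100, 25), 2))
    # prints: 4 13450 36.85
```

Because the node count is rounded up, the cost moves in steps: a per-node rate of 30 gigabytes per day still requires four nodes for the same 100 gigabyte daily capacity.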
Software alone will not a process make. The cost of hardware must also be accounted for, as it scales with the number of nodes, assuming, for simplicity, that virtualization is not used and a 1 : 1 relationship between processing nodes and physical machines. Hardware for each node, including breakdown costs, over five years will run approximately $8,000, or $1,600 per year. Adding this to our equation we arrive at the following formula for node scalability in figure 9.2.1.

\[ \frac{\$6{,}010 + (\lceil c/g \rceil - 1) \cdot \$2{,}480 + \lceil c/g \rceil \cdot \$1{,}600}{365} \]
Figure 9.2.1: Ideal Per Unit Daily Capacity Cost Scalability
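As a worked illustration of the expression in figure 9.2.1, suppose a daily capacity of \(c = 100\) gigabytes and a per node rate of \(g = 25\) gigabytes per day. Both values are hypothetical and chosen only to make the arithmetic easy to follow.

```latex
\begin{align*}
\lceil c/g \rceil &= \lceil 100/25 \rceil = 4 \\
\text{licensing} &= \$6{,}010 + (4 - 1) \cdot \$2{,}480 = \$13{,}450 \\
\text{hardware} &= 4 \cdot \$1{,}600 = \$6{,}400 \\
\frac{\$13{,}450 + \$6{,}400}{365} &= \frac{\$19{,}850}{365} \approx \$54.38
\end{align*}
```

Under these assumed inputs, hardware adds roughly half again to the licensing-only figure, and both terms grow with each additional node.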
9.2.1 Pagination Architecture
What exactly does LAW do when it processes native ESI into a paginated format? LAW utilizes the Tagged Image File Format (“TIFF”) for pagination. It does this by creating a virtual printer within Windows’ print system[7]. Using the three minimum modules I discussed supra, the user creates a new case via the Admin module, then imports the native ESI via the ED Loader module. With the native data loaded, LAW begins its TIFF conversion process by pushing the native files to the associated native application. The associated native application then prints to the virtual printer, and the virtual printer “prints” to a TIFF file which is then incorporated into the case. Figure 9.2.3 illustrates what this workflow looks like.
LAW’s support system relies on the third party application associated with the file type to process native ESI to TIFF. In some cases manual intervention is required because the printing process cannot be completely automated. The requirement of these proprietary applications also introduces another cost factor. For each LAW processing node, a license of the appropriate application suites must be purchased. We can think of this additional cost as \(\sum_{i=1}^{a} \mathrm{cost}(i)\), where \(\mathrm{cost}(i)\) is the license cost for application \(i\). See figure 9.2.2.

\[ \frac{1}{365}\left[\$6{,}010 + (\lceil c/g \rceil - 1) \cdot \$2{,}480 + \lceil c/g \rceil \cdot \$1{,}600 + \sum_{i=1}^{a} \mathrm{cost}(i)\right] \]
Figure 9.2.2: Ideal Scalability Including Third Party Application Costs
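The expression in figure 9.2.2 can be sketched in code as follows. This is an illustrative sketch only: the function name, the throughput figures, and the application license prices in the example are hypothetical, and the application costs are added once, exactly as the expression is written.

```python
import math

# Annual figures quoted in the text (USD).
BASE_LICENSE = 6010       # EDD Premium bundle, first workstation
PER_NODE_LICENSE = 2480   # each additional node
HARDWARE_PER_NODE = 1600  # $8,000 of hardware spread over five years

def daily_cost(c, g, application_costs=()):
    """Evaluate the figure 9.2.2 expression.

    c: daily capacity in GB, g: GB per node per day,
    application_costs: annual license cost of each third party application.
    """
    nodes = math.ceil(c / g)
    annual = (BASE_LICENSE
              + (nodes - 1) * PER_NODE_LICENSE
              + nodes * HARDWARE_PER_NODE
              + sum(application_costs))
    return annual / 365

if __name__ == "__main__":
    # Hypothetical: two third party applications at $150 and $400 per year.
    print(round(daily_cost(100, 25, [150, 400]), 2))  # prints: 55.89
```

Passing an empty list of application costs reduces the function to the expression in figure 9.2.1.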
The use of TIFF as the format for pagination is problematic. First, TIFF is an image container, whereas the majority of content contained in ESI is textual. Second, TIFF files are large and cause an extreme increase in data set size for the final processed paginated form versus the original native ESI. Third, the conversion to TIFF requires additional steps to capture non-visible information. Finally, conversion to TIFF loses textual information, especially the location of text within the paginated image. This introduces the need to use OCR to recapture the textual information and its location within the paginated corpus.
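To give a rough sense of the size increase, the sketch below estimates the raw raster size of a single page image. It is a back-of-the-envelope calculation only: the page size, resolution, and the 3,000 character page of text are assumptions of mine, and the estimate ignores row padding, file headers, and compression, so real TIFF files will usually be smaller than the raw figure.

```python
def uncompressed_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw raster size of one page image, ignoring padding, headers, compression."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8

if __name__ == "__main__":
    # Assumed: US Letter page (8.5 x 11 in) at 300 dpi, black and white (1 bit).
    image_bytes = uncompressed_page_bytes(8.5, 11, 300, 1)
    # Assumed: a page of plain text holds about 3,000 one-byte characters.
    text_bytes = 3000
    print(image_bytes)                # prints: 1051875
    print(image_bytes // text_bytes)  # prints: 350
```

Even allowing for compression, the gap between a page of characters and a page of pixels suggests why an image-based paginated corpus tends to be much larger than the native text it was produced from.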