RAMAN AND ATR-FTIR SPECTROSCOPY TOWARDS CLASSIFICATION OF WET BLUE BOVINE LEATHER USING RATIOMETRIC AND CHEMOMETRIC ANALYSIS

There is a substantial loss of value in bovine leather every year due to a leather quality defect known as “looseness”. Data show that 7% of domestic hide production is affected to some degree, with a loss of $35 m in export returns. This investigation aims to gain a better understanding of tight and loose wet blue leather based on vibrational spectroscopy observations of its structural variations, which are caused by physical and chemical changes that also affect the tensile and tear strength. Several regions of the wet blue leather were selected for analysis. Samples of wet blue bovine leather were collected and studied in sliced form using Raman spectroscopy (with a 532 nm excitation laser) and Attenuated Total Reflectance - Fourier Transform InfraRed (ATR-FTIR) spectroscopy. The purpose of this study was to use ATR-FTIR and Raman spectra to classify distal axilla (DA) and official sampling position (OSP) leather samples by univariate analysis, multivariate analysis, or both. For univariate analysis, the 1448 cm⁻¹ (CH2 deformation) band and the 1669 cm⁻¹ (Amide I) band were used to evaluate the lipid-to-protein ratio from OSP and DA Raman and IR spectra as indicators of leather quality. Curve-fitting by the sums-of-Gaussians method was used to calculate the peak area ratios of the 1448 and 1669 cm⁻¹ bands. The ratio values obtained for DA and OSP are 0.57 ± 0.099 and 0.73 ± 0.063 for Raman, and 0.40 ± 0.06 and 0.50 ± 0.09 for ATR-FTIR. The results provide significant insight into how these regions can be classified. Further, to identify the spectral changes in the secondary structures of collagen, the Amide I region (1600–1700 cm⁻¹) was investigated and curve-fitted area ratios were calculated. The 1648:1681 cm⁻¹ (non-reducing: reducing collagen types) band area ratios were used for Raman and 1632:1650 cm⁻¹ (triple helix: α-like helix collagen) for IR. The ratios show a significant difference between the two classes.
To support this qualitative analysis, logistic regression was performed on the univariate data to classify the samples quantitatively into one of the two groups. Accuracy for Raman data was 90% and for ATR-FTIR data 100%. Both Raman and ATR-FTIR complemented each other very well in differentiating the two groups. As a comparison, and to reconfirm the classification, multivariate analysis was performed using Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). The results obtained indicate good classification between the two leather groups based on protein and lipid content. Principal component score 2 (PC2) distinguishes OSP and DA by symmetrically grouping samples at positive and negative extremes. The study demonstrates an excellent model for wider research on vibrational spectroscopy for early and rapid diagnosis of leather quality.


Introduction
Every year more than a billion animals are slaughtered as part of the animal production industry for meat. In turn, this generates returns of over a billion dollars for the global leather industry, meat processing's most important coproduct sector [1,2]. The production of leather is split into three phases: animal slaughtering, tanning and manufacturing of the finished product for the commercial market. Tanning is one of the most important stages in leather production. It involves processing the raw skin or hide to retain its natural properties by stabilising the molecular structure and making it more durable [3]. Previously, natural chemicals such as plant tannins, alum, and other minerals were used in the tanning process. These had some advantages over current methods using synthetic chemicals, although the current methods take only a fraction of the processing time required for the earlier ones [4]. Wet Blue refers to part-processed chrome-tanned leather in the wet state. During this stage the skin or hide is protected from decomposition through chemical crosslinking that stabilises the collagen network [5]. The blue colour comes from the chromium tanning agent (Chromium (III) oxide), which stays in the leather after tanning.
Looseness is a fault found in leather that affects its quality. It manifests itself as corrugations on the outer surface of finished leather when bent inward. Whilst processing is known to exaggerate the fault, the root cause of the less densely packed fibres in affected regions is poorly understood, although potential causes may include environment, nutrition, breed and age. Looseness is a major concern to the leather industry in terms of its effect on the structure of the leather and the appearance of the final leather product [6]. At present looseness can only be accurately identified once the leather is dried, so tanners can only address it by either discarding the leather or applying remedial treatment, costing both time and money [7][8][9]. There is an understanding that looseness is prevalent in specific regions including the shoulder and flanks, whereas other regions such as the backbone and official sampling position (OSP) are typically unaffected. This study investigates wet blue leather from two regions: distal axilla (DA), i.e., from the flank side, and official sampling position (OSP), i.e., from near the central lower region. The aim is to obtain a better understanding of how tight and loose wet blue leather might be differentiated through measurement, since the OSP and distal axilla regions typically give tighter and looser leathers, respectively. The intention of this study is to develop a model using nondestructive techniques that can identify the looseness fault at an early stage of leather production. Different strategies or markets for the affected hides can then be identified, not only to save time but also to minimise the damaging costs incurred by identifying looseness at a later stage of processing.
Vibrational spectroscopy techniques such as Raman spectroscopy and Attenuated Total Reflectance - Fourier Transform InfraRed (ATR-FTIR) spectroscopy, supported by ratiometric band intensity analysis and chemometric methods, are used here to identify structural variations which affect the physical properties of leather from the two identified regions, OSP and DA.
Raman spectroscopy measures the inelastic scattering of photons (with visible wavelengths) as they interact with the vibrational motions of molecules, and provides useful information about molecular structure via both band position and intensity. Raman can be used for non-invasive probing of chemical and biological samples [10,11]. Infrared spectroscopy (IR) is based on the absorbance of infrared photons by molecules due to vibrational motion of the molecules present in the matrix. Both non-destructive techniques are fast, require minimal sample preparation, and have high specificity and sensitivity [12]. Raman spectroscopy has the advantages of a very weak water signal, and hence minimal interference from water in biological samples [13], of not causing any damage to the sample [14], and of allowing in situ detection using optical fibres or microscopes. Raman is particularly sensitive to structures that are easily polarised, such as aromatic rings and sulphur-containing groups. In IR spectroscopy, water has an absorption that can mask the characteristic amide I band at 1640 cm⁻¹ and a very intense, broad absorption around 3300 cm⁻¹ that can obscure absorption by other O-H and N-H vibrations. If water interference can be minimised, then the advantage of IR is its sensitivity to vibrations associated with the amide bonds in proteins. The secondary and tertiary structures of proteins influence the shape of the amide bands, so IR spectroscopy provides useful information about protein structure. We have used an ATR-FTIR spectrometer that limits water interference by using the very short effective path length that results from the attenuated total reflection process.
For most studies of spectral diagnosis of biological samples, the mid-IR (MIR) spectrum within the 4000-600 cm⁻¹ range seems to be more effective than the near-IR (NIR) range (14,000-4000 cm⁻¹). Bands within the range of 4000-1500 cm⁻¹ are characterized by various stretching modes of functional groups of molecules. Bands below 1500 cm⁻¹ are dominated by deformation, bending and ring vibrations of the molecular "backbone", and are generally referred to as the fingerprint region of the spectrum. Because the selection rules for Raman and infrared (IR) spectroscopies differ, some modes are active in both, whereas others are only Raman active or only IR active. MIR and Raman spectra both exhibit amide bands that are relevant to the structure of collagen. Thus, IR and Raman spectroscopies provide similar and complementary information on molecular vibrations [15][16][17].
Sample preparation is relatively simple compared to other analytical techniques, such as high-performance liquid chromatography (HPLC) and colorimetric methods [18][19][20][21][22]. Finding the variations in the initial leather processing steps reduces the costs of downstream processing. Therefore, these label-free and non-destructive techniques are highly attractive tools for understanding wet blue leather [13,16,17].
Several bone studies have utilised Raman and IR spectroscopy to identify defects [5,23,24], the quality of bone affected by bacteria [25][26][27], or changes in collagen due to cross-links [20-22, 28, 29], but no work has so far been performed on wet blue leather defects. To the best of our knowledge, this work is the first attempt to identify the variation between loose and tight leather regions using these two techniques.

Sample preparation
All bovine wet blue samples were prepared by New Zealand Leather and Shoe Research Association (LASRA®) using the conventional methods [19]. Samples were collected from the official sampling position (OSP) and distal axilla (DA) of the wet-blue and stored at below 4°C until analysis.
Wet blue samples were sliced using a Leica CM1850UV Cryostat to 40 μm thickness. Six replicates of each sample were cut and placed on a microscope slide for Raman and ATR-FTIR analysis as described below.

Data acquisition and spectral processing
Six leather samples labelled as 'DA' and displaying signs of looseness and five wet blue bovine leather samples labelled as 'OSP' from the tighter regions of the hide were prepared for analysis using the method described above. These samples were then analysed using a home-built Raman microscope utilising a Teledyne-Princeton Instruments (USA) FERGIE spectrometer, with a 532 nm excitation laser (~10 mW laser power) focused onto the sample with a spot size diameter of ~1-2 μm using a 40× magnification, 0.65 NA objective. For both Raman and IR measurements, spectra were collected from 5 different spots on each sample. Raman spectra were acquired with an exposure time of 5 s per frame and 10 frames (each frame was stored separately). Therefore, 50 spectra were obtained from each DA and OSP sample.
A Thermo Scientific™ iD5 Nicolet™ iS™5 Attenuated Total Reflectance - Fourier Transform InfraRed (ATR-FTIR) spectrometer was used to collect ATR-FTIR spectra from the same wet blue samples. Spectra were recorded by attenuated total reflection (ATR) on a diamond crystal, and 16 scans were collected from 5 different spots for each sample. Figure 1 shows the flowchart for spectral analysis. For principal components analysis, each spectrum was preprocessed with an algorithm written using the SciKit Learn package [30] in Python 3.7. Baseline correction, background subtraction and spectral averaging were performed using the Python algorithm.
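The preprocessing algorithm itself is not given in the text. As a rough illustration only, the three steps named above (background subtraction, baseline correction, averaging) might be sketched as follows. The polynomial baseline, its order, the function names and the synthetic frames are all assumptions made for this sketch and are not taken from the study.

```python
import numpy as np

def subtract_polynomial_baseline(wavenumber, intensity, order=3):
    """Fit a low-order polynomial to the whole spectrum and subtract it.

    This is a deliberately simple baseline estimate (an assumption of this sketch).
    """
    # Centre and scale the axis so the polynomial fit is well conditioned
    x = (wavenumber - wavenumber.mean()) / wavenumber.std()
    coeffs = np.polyfit(x, intensity, order)
    return intensity - np.polyval(coeffs, x)

def preprocess(wavenumber, frames, background, order=3):
    """Background-subtract and baseline-correct each frame, then average the frames."""
    corrected = [
        subtract_polynomial_baseline(wavenumber, frame - background, order)
        for frame in frames
    ]
    return np.mean(corrected, axis=0)

# Synthetic stand-in for 10 stored frames from one spot (not measured data)
rng = np.random.default_rng(0)
wavenumber = np.linspace(800, 1800, 501)
peak = np.exp(-0.5 * ((wavenumber - 1448) / 10.0) ** 2)
background = np.full_like(wavenumber, 0.2)
frames = np.array([
    peak + background + 1e-3 * wavenumber + 0.01 * rng.standard_normal(wavenumber.size)
    for _ in range(10)
])

mean_spectrum = preprocess(wavenumber, frames, background)
print(wavenumber[np.argmax(mean_spectrum)])  # position of the strongest band
```

A polynomial fitted to the full spectrum is slightly biased by the bands themselves; iterative or asymmetric baseline methods are common alternatives when that bias matters.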
For ratiometric analysis, Origin 2018b (Origin Lab Corporation, Northampton, MA, USA) was used. Preprocessing, consisting of a 7-point, zero-order derivative Savitzky-Golay smoothing function, was applied to smooth spectral noise. Curve fitting by sums-of-Gaussians was used to determine band areas, which were subsequently used to calculate area ratios of the peaks of interest.
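The ratiometric analysis in this work was carried out in Origin. Purely as an illustration of the same idea in Python, the sketch below smooths a synthetic spectrum with a 7-point Savitzky-Golay filter, fits a sum of two Gaussians, and forms the band area ratio. The polynomial order of the filter, the starting guesses and the synthetic band parameters are assumptions chosen for the example, not values from the study.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.signal import savgol_filter

def gaussian(x, amplitude, centre, sigma):
    return amplitude * np.exp(-0.5 * ((x - centre) / sigma) ** 2)

def two_gaussians(x, a1, c1, s1, a2, c2, s2):
    """Sum of two Gaussian bands."""
    return gaussian(x, a1, c1, s1) + gaussian(x, a2, c2, s2)

def gaussian_area(amplitude, sigma):
    # Analytic area under a Gaussian: amplitude * sigma * sqrt(2*pi)
    return amplitude * abs(sigma) * np.sqrt(2.0 * np.pi)

# Synthetic spectrum with bands near 1448 and 1669 cm^-1 (illustrative, not measured)
rng = np.random.default_rng(1)
x = np.linspace(1350, 1750, 401)
y = two_gaussians(x, 0.6, 1448.0, 12.0, 1.0, 1669.0, 15.0)
y = y + 0.01 * rng.standard_normal(x.size)

# 7-point smoothing with deriv=0; polyorder=2 is an assumption (not stated in the text)
smoothed = savgol_filter(y, window_length=7, polyorder=2, deriv=0)

# Starting guesses for amplitude, centre and width of each band (assumed)
initial_guess = (0.5, 1450.0, 10.0, 0.9, 1665.0, 10.0)
params, _ = curve_fit(two_gaussians, x, smoothed, p0=initial_guess)
a1, c1, s1, a2, c2, s2 = params

ratio = gaussian_area(a1, s1) / gaussian_area(a2, s2)
print(f"fitted centres: {c1:.1f}, {c2:.1f}; area ratio 1448/1669: {ratio:.2f}")
```

Using the analytic Gaussian area rather than numerical integration keeps the ratio independent of the sampling interval of the spectrum.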

Results and discussion
Raman and ATR-FTIR spectra are shown in Figs. 2 and 3 respectively. Bands that are known to be associated with functional groups and structures in protein are labelled in the Raman spectra. Bands with positions within instrumental resolution in the Raman and FTIR spectra are assumed to have the same chemical and structural origin [31].

Peaks of interest
Band assignments and their interpretation are based on Raman and IR studies of collagen tissues [16,17,32]. Visual inspection showed variations in the collagen region (1002-1680 cm⁻¹). A careful examination of the spectra showed shifting of a few peaks due to the complexity of the biochemical components in leather samples. Sharp peaks were observed in OSP leather Raman spectra, whereas significant overlapping of bands was found in DA in the 1550-1700 cm⁻¹ region.
The observed peak positions of the Raman and IR bands observed and their assignment for wet blue leather are shown in Table 1.
There is a significant shift observed in peak position, intensity and number of signature peaks between DA and OSP samples in Raman and IR spectra. Both DA4 and DA5 show a broad band different from the other DA replicates, which indicates that some structural changes in collagen may occur due to alterations in secondary structures -α helix, β sheet, random coils or immature cross links.
The peak identified at 1669 cm⁻¹ is associated with random or unordered protein structure (e.g., random coils). The amide I vibration is dominated by the peptide carbonyl stretching vibration, with some contribution from C-N stretching and N-H in-plane bending [32]. The bands near 875 and 920 cm⁻¹ can be assigned to the C-C stretching vibrations of the amino acids characteristic of collagen, hydroxyproline and proline. The band near 1002 cm⁻¹ is assigned to the phenyl ring breathing mode of the amino acid phenylalanine [15,28,33]. The 2340 cm⁻¹ band observed in a few DA and OSP samples is the asymmetric stretching band of CO2, which arises from background in the spectra.
Special emphasis was placed on the spectral features at 1448 cm⁻¹, which is assigned to the CH2 bend of phospholipids [34], and at 1650-1669 cm⁻¹, which corresponds to the amide I region and is comprised of both proteins and lipids [35,36]. Selecting these two bands serves as an excellent indicator of variations, because any changes due to lipid variation are factored out using the 1448 cm⁻¹ lipid band [37,38]. It was found that 1448 cm⁻¹ is the more intense Raman band when compared to 1669 cm⁻¹, whereas 1632 cm⁻¹ is the most intense IR band. Therefore, collagen analysis was performed using the peak area ratio of the CH2 deformation band at 1448 cm⁻¹ and the amide I band at 1669 cm⁻¹ for Raman analysis.
Before analysing the Raman and IR marker bands for ratiometric analysis, we decided to validate the accuracy of the method and spectral positions identified for DA and OSP. Raman analysis was performed on the other regions of the wet blue to classify loose and tight features based on specific biomarker Raman bands.
Our hypothesis is that the OSP region tends to give tight leather as it comes from the central backbone part of the hide, whereas the DA region is more prone to looseness as it comes from the flanks or sides of the hide. To confirm that the features we have identified are characteristic of looseness and not simply associated with the location of the sample, we selected a few regions from OSP that show wrinkles characteristic of looseness, and a few regions from DA that are not wrinkled or stretched, to investigate for tightness.
We have also selected regions from other parts of the wet blue, such as the neck and shoulder. The selection of samples was based on visual examination of the wet blue. Figure 4 shows the Raman spectra of all these regions, labelled as OSP (tight), the characteristic feature of OSP; OSP (loose), the few loose regions in OSP; DA (loose), the characteristic feature of DA; and DA (tight), any tight regions observed in DA. Table 2 summarises some differentiating Raman bands from OSP, DA and other regions of wet blue leather. A few characteristic bands categorise wet blue into two sets, loose and tight leather, rather than OSP and DA. The following two signals are of interest in classifying looseness: (1) the protein backbone conformation: the amide I band detected at 1677 cm⁻¹ for tight regions in OSP and DA corresponds to anti-parallel β-sheets which give a tight structure whereas the amide I  But the loose and tight features are more dominant in the DA and OSP regions and are visible in the spectra (Figs. 2, 3, and 4), even when loose regions within the tight OSP section, or tight regions within the loose DA section, are specifically examined; location therefore has potential as a basis for further classification. Hence, further study was carried out on the DA and OSP locations to classify loose and tight structural features and to demonstrate the potential of Raman and ATR-FTIR spectroscopy.
Two spectral analysis techniques were employed to find the best classification fit for various biochemical components affecting the wet blue leather quality, such as proteins, lipids or nucleic acids.
A univariate statistical method which includes band intensities, area ratios and intensity ratio calculations for the interpretation of spectra [42,43]. This ratiometric analysis was carried out for qualitative classification, which was further supported by a logistic regression algorithm to enable straightforward quantitative classification of DA and OSP.
A multivariate statistical method (based on Principal Component Analysis) that considers the whole spectrum but performs classification with a small number of variables (data set dimension reduction) that extract the maximum variance in the data. Multivariate analysis makes no a priori assumptions about selecting the best variables for classification.

Univariate analysis
Ratiometric analysis, a simple approach, was employed to identify the spectral variations by Raman and IR spectroscopy and generate a systematic and comparative trend of structural features of biochemical components in OSP and DA. Ratiometric analysis can overcome variations due to sample thickness and morphology, background scattering fluctuations and other instrumental effects [17].
Intensity-based ratiometric analysis may result in inaccurate interpretation due to baseline estimation issues [35,44]. Therefore, the means and standard deviations of the peak area ratios for the CH2 deformation (1448 cm⁻¹) and Amide I (1669 cm⁻¹) bands from Figs. 2 and 3 were calculated [16,17]. For the Raman spectra, ratio values of 0.57 ± 0.099 and 0.73 ± 0.063 were obtained for DA and OSP, respectively. For IR, the values were 0.40 ± 0.057 and 0.49 ± 0.13 for DA and OSP samples (Additional file 1). Both Raman and IR show significant variation between the two categories of loose and tight samples. Although ATR-FTIR and Raman spectroscopy arise from the same physical phenomenon of molecular vibration, the processes of Raman scattering and infrared absorption are fundamentally different, as observed in Table 1. Additional bands such as that at 1548 cm⁻¹, observed in IR but absent in Raman, provide an understanding of cross-links in collagen [44,45]. Hence, the combination of Raman and FT-IR gives synergistic information on complex samples in a non-destructive manner.
The variation between the two categories could be the result of changes in the collagen network, which directs further investigation towards the amide I band of collagen, which consists of several secondary structures [24]. Curve fitting by sums-of-Gaussians method was used to find the component area under a broad band. Accurate peak areas, and peak centres then can be deduced. Univariate analysis was again performed using the collagen components in the section below.

Alterations in collagen network
The amide regions of proteins are overlapped by many underlying bands [27]. In vibrational spectroscopic methods, such as FTIR and Raman, resolution of underlying constituent peaks and calculation of their contributions offer a wealth of information, as these peaks are very sensitive to secondary structure [15,46]. Therefore, curve-fitting was carried out on both Raman and IR data to investigate the spectral changes in the secondary structures of collagen. A typical result of curve fitting four Gaussian components to the Amide I band in the Raman spectrum is shown in Fig. 5. These secondary bands have been used to investigate the lipid to protein ratio as a measure of collagen quality.
There is a well-established frequency-assignment correlation in the literature (Table 3) for the underlying bands in the amide I group [34,43]. The Amide I band in the OSP spectrum is strongly asymmetric and its curve-fitting (Fig. 5) yields components in the 1600-1700 cm⁻¹ region which can be mainly assigned to collagen (1648 and 1669 cm⁻¹), elastin (1681 cm⁻¹), and amino acids (1610 and 1698 cm⁻¹).
The triple helical structure of the collagen molecule is unique and there is no specific peak wavenumber for these secondary structures (e.g., α-helix or β-sheet). Therefore, changes in collagen's helical structure were investigated empirically by observing changes in the curve-fitted area ratios, as identified by the curve-fitting analysis. Collagen crosslinking is measured as changes in the amide I envelope [39,46]. It was observed that the Raman band at ~1669 cm⁻¹ was present in the fractions containing the trivalent collagen cross-links, whereas IR showed a band at 1632 cm⁻¹ but no band was evident at ~1669 cm⁻¹ [17,27]. From literature studies [47,48], biochemical analysis of collagen peptides showed that pyridinoline (Pyr) crosslinks result in a band at 1666 cm⁻¹. Therefore, the peak at 1669 cm⁻¹ reflects pyridinoline cross-linked collagen peptides [29]. These observations from Raman and IR spectra provide additional information on changes in the amide I band. Most of the underlying bands of amide I arise from the structure of the collagen triple helix as well as the telopeptides (1632, 1645, 1655, 1672, and 1682 cm⁻¹). The intermolecular crosslinking of collagen is a key element in determining tensile strength and elasticity [49,50].
For amide I, the Raman band area ratio of 1648/1681 cm⁻¹ (non-reducible and reducible collagen types) was used for analyzing variations between loose and tight leathers, whereas, for IR, the 1632/1650 cm⁻¹ (triple helix and α-like helix collagen types) ratio was used, as shown in Fig. 6 (Additional file 1).
For quality assessment, a Student's t-test was carried out between the two ratio datasets. For Raman, the t-test gave a value of p = 0.0008, and for IR, p = 6.8 × 10⁻⁵. There are therefore significant differences (p < 0.05) between the DA and OSP ratios, which makes the ratios suitable for fitting to a regression model.
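A two-sample t-test of this kind can be sketched with SciPy. The ratio values below are randomly generated placeholders, not the measured data, and the group sizes are likewise assumed for illustration.

```python
import numpy as np
from scipy import stats

# Placeholder band-area ratios for the two groups (hypothetical values)
rng = np.random.default_rng(2)
da_ratios = rng.normal(loc=0.60, scale=0.10, size=30)
osp_ratios = rng.normal(loc=0.80, scale=0.08, size=30)

# equal_var=True gives the classical Student's t-test;
# equal_var=False would give Welch's variant for unequal variances
t_stat, p_value = stats.ttest_ind(da_ratios, osp_ratios, equal_var=True)

significant = p_value < 0.05
print(f"t = {t_stat:.2f}, p = {p_value:.2e}, significant: {significant}")
```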

Logistic regression
Quantitative classification of DA and OSP involves a continuous independent variable (peak area ratio) and a binary dependent variable (DA vs OSP), therefore a logistic regression (LR) algorithm [37] was devised to discriminate the samples using the SciKit Learn package [30] in Python 3.7.
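A minimal sketch of such a classifier with scikit-learn is given below. The peak area ratios are synthetic placeholders, the regularisation strength `C` is an assumed setting, and the model is scored on its own training data because no train/test split is described in the text.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic peak area ratios (placeholders, not the measured values)
rng = np.random.default_rng(3)
da = rng.normal(0.57, 0.05, size=20)
osp = rng.normal(0.73, 0.05, size=20)

X = np.concatenate([da, osp]).reshape(-1, 1)  # one continuous predictor
y = np.array([1] * 20 + [0] * 20)             # binary response: 1 = DA, 0 = OSP

# C=100 weakens the default L2 penalty (an assumption of this sketch)
model = LogisticRegression(C=100.0)
model.fit(X, y)

predicted = model.predict(X)
training_accuracy = (predicted == y).mean()
print(f"training accuracy: {training_accuracy:.2f}")
print(model.predict([[0.50], [0.80]]))  # low ratio -> DA (1), high ratio -> OSP (0)
```

With a single predictor the fitted model reduces to a threshold on the ratio, which is why the univariate classification is straightforward to interpret.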
A confusion matrix was generated from the output that describes the performance of classification. It summarises correct and incorrect spectra classification. It is useful for two-class classification and in measuring recall, precision and accuracy [18,51,52]. The confusion matrix for Raman and IR data obtained is presented in Tables 4 and 5.
The first entry in the confusion matrix is the number of correctly identified DA samples, i.e. 5/5, which is a perfect classification, whereas for OSP it is 4/5, which is also close to a perfect fit. Accuracy, precision and recall are of importance, where:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
where TP = true positive, TN = true negative, FP = false positive, and FN = false negative, with DA arbitrarily set as True and OSP set as False.
The accuracy for the Raman data is 0.9 (90%), precision is 1.0 (100%), and the recall score is 0.8 (80%). The IR data presented in Table 4 show a perfect classification of 6/6 for all six DA and OSP samples, meaning that all were correctly classified. The accuracy, precision and recall scores are all 1.0 (100%).
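The three scores follow from the confusion-matrix counts through the standard definitions (accuracy is the fraction of all samples classified correctly, precision is TP/(TP + FP), recall is TP/(TP + FN)). The short sketch below applies them to illustrative counts; these counts are hypothetical and are not read from Tables 4 and 5.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, precision, recall

# Hypothetical counts for a ten-sample, two-class problem
accuracy, precision, recall = classification_metrics(tp=4, tn=5, fp=0, fn=1)
print(accuracy, precision, recall)  # 0.9 1.0 0.8
```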
From the results obtained, it is evident that Raman peak area ratios and IR peak ratios are good predictors for differentiating the leather type. Both techniques complement each other very well. The difference in recall score between the Raman and IR data provides the motivation for the multivariate analysis of the Raman data. Although univariate analysis is quite useful, it might be possible to obtain a still more useful prediction from Raman spectra by using a multivariate analysis to reveal the differences, especially when there is a large dataset.

Multivariate analysis
Multivariate analysis can be used to quickly characterise the "types" or "classes" of spectra or samples present in a large data set. An unsupervised method, Principal Components Analysis, is used that can determine the existence of classes in the data set without any assumptions of the number of classes. The classes are determined by transforming the data set, expressed in the original spectral variables, to a new description using variables (principal components) that maximise the separation between samples (the principal components are the eigenvectors of the variance-covariance matrix). A scores plot shows the samples plotted using the principal components. If distinct clusters of samples are observed in the scores plot, then classes exist in the data set. All spectral variables in the original data set have been used in the analysis presented here [27].
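The PCA step described above can be sketched with scikit-learn as follows. The spectra are synthetic (two bands with a class-dependent amide I height), and the class sizes and noise level are assumptions made only so that the example runs on its own.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
wavenumber = np.linspace(800, 1800, 200)

def band(centre, width):
    return np.exp(-0.5 * ((wavenumber - centre) / width) ** 2)

def make_spectra(n, amide_height):
    """n noisy synthetic spectra with bands near 1448 and 1669 cm^-1."""
    base = 0.6 * band(1448, 12) + amide_height * band(1669, 15)
    return base + 0.02 * rng.standard_normal((n, wavenumber.size))

# Rows are spectra, columns are spectral variables; all variables are used
spectra = np.vstack([make_spectra(10, 1.0), make_spectra(10, 0.7)])
labels = np.array(["OSP"] * 10 + ["DA"] * 10)  # used only for plotting, not for fitting

pca = PCA(n_components=3)                  # PCA mean-centres the data internally
scores = pca.fit_transform(spectra)        # coordinates for the scores plot
loadings = pca.components_                 # one row per PC, one column per wavenumber
explained = pca.explained_variance_ratio_  # fraction of variance per PC

print(explained)
print(wavenumber[np.argmax(np.abs(loadings[0]))])  # band dominating PC1
```

Because PCA is unsupervised, the labels play no part in the fit; any clustering seen in the scores therefore reflects structure in the spectra themselves.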
There is also a supervised method, linear discriminant analysis (LDA), that assumes the existence of classes and then proceeds to construct a function (the discriminant) that gives the best separation between the classes. LDA works on a similar approach to PCA, but LDA creates a linear function (the discriminator) that maximises the differences among the classes or groups [44]. It shows how well the classes are separated, as well as where the classification fit is robust and where samples are misclassified. To demonstrate the best classification performance in a robust model, combinations of both PCA and LDA were attempted. A potential issue with LDA is that it will always sort samples into classes, so it is difficult to determine if the model contains errors. However, performing PCA prior to LDA can independently confirm the existence of classes in the data set. The principal components from the PCA analysis can also be used to construct the discriminant function in LDA (PCA-LDA). Figure 7 below shows the loading plots of the first three principal components. PC1 explains 59.1% of the variance in the data, while PC2 and PC3 explain 16.2% and 11.7%, respectively.
The loading plots shown in Fig. 7 indicate which spectral bands contribute most to the variance described by each principal component. The OSP average spectrum is used as a reference for comparison of loadings. This gives an understanding of the origin of the differences between the samples corresponding to spectral variations. The strong contribution in PC1 and PC2 is from the C=O stretch around 1669 cm⁻¹, which is usually from the amide I band and is mainly due to proteins and lipids. PC2 has spectral contributions from the Amide III band, around 1243 cm⁻¹, which is purely collagen, and from the CH2 wag with a broad noisy band around 1340 cm⁻¹, which indicates collagen and lipids. PC3 shows another significant contribution around 1100 cm⁻¹ that is broadly from C-O-C modes, which is mainly protein [33,50]. The loadings therefore show that DA and OSP samples are differentiated by their protein and lipid content. These observations are consistent with the identification of the tight and loose marker bands shown in Fig. 4. It appears that these bands are responsible for the classification of the samples using Raman, and there was a distinct difference between the two leather types. Figure 8(a-c) shows the two-dimensional (2-D) score plots, with 95% confidence ellipses, of combinations of two principal components, with the aim of finding the separation between DA and OSP. The scree plot in Fig. 8d shows the percentage of variance that is accounted for by each principal component.
DA and OSP were not separable as clusters within the principal components, but OSP samples show a significant separation along PC3 and DA samples are on the positive side of PC2. No perfect distinction is found between the replicates of OSP and DA in other dimensions. A few points of the OSP and DA samples in Fig. 8b overlap with each other. PCA scores plots reveal the intra- and intergroup variation between loose and tight wet blue samples [27,46]. After comparing the sample replicates at the individual level, principal component analysis was performed on the average spectra of loose and tight replicates. An interesting observation, shown in Fig. 9, is that OSP and DA are symmetrically grouped at the positive and negative extremes, respectively, of PC2, which accounts for 7.8% of the variance and discriminates between the two. This reduced dataset, formed after averaging six samples of OSP and DA, makes the differentiation more identifiable and easier to visualise.
By displaying the data along the directions of maximal variance, PCA demonstrates that Raman spectra of Wet Blue can be separated into classes (i.e. loose and tight). However, the principal components might not give the maximum separation between the classes. Linear discriminant analysis was used to construct a function (the discriminant function) that maximised the class separability. LDA assumes that the data are Gaussian distributed, that every row belongs to exactly one group (samples are mutually exclusive) and that the variances are the same for both groups. Either the original variables or the principal components can be used to construct the discriminant function; the principal components are used here as they have the advantage of being independent. If two principal components are used (so the scores plot is planar), the LDA process finds the line in the scores plot plane that maximises separability.
When LDA is done on the PC scores, the mean centre of each grouping is calculated, and each spectrum is predicted to belong to one of the groups based on its distance from the centre of the group. The accuracy of the prediction is an indication of how well the groups are separated [15,23].
A leave-one-out cross validation method [53] was utilised to train the LDA classifier, in which each sample in turn is left out of the calibration model and then predicted by it. Results are plotted in Fig. 10, which shows the observed group against the predicted group, along with the cross-validation summary (Table 6). It also shows that one of the six DA samples was falsely identified as OSP and two of the five OSP samples were falsely identified as DA. The error rate for cross validation of the training data is 12.33%. A Wilks' lambda test conducted on the discriminant variable showed that the discriminant function is highly significant (p < 0.05), in agreement with the classification summary.
The cross-validation summary table shows that OSP has a classification accuracy of 60% and DA has 83.33% which proves that both are mutually exclusive as was expected.
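A PCA-LDA classifier with leave-one-out cross-validation can be sketched with scikit-learn as shown below. The spectra are synthetic and well separated, so the example only illustrates the mechanics; the number of retained components and the choice to refit PCA inside every fold are assumptions, since the text does not specify them.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(5)
wavenumber = np.linspace(800, 1800, 200)

def make_spectra(n, amide_height):
    """n noisy synthetic spectra with a class-dependent band near 1669 cm^-1."""
    band = np.exp(-0.5 * ((wavenumber - 1669) / 15.0) ** 2)
    return amide_height * band + 0.02 * rng.standard_normal((n, wavenumber.size))

# Six DA and five OSP spectra, mirroring the group sizes in the text (data are synthetic)
spectra = np.vstack([make_spectra(6, 0.7), make_spectra(5, 1.0)])
labels = np.array([1] * 6 + [0] * 5)  # 1 = DA, 0 = OSP

# PCA scores feed the discriminant; the pipeline refits both steps in each fold
pca_lda = make_pipeline(PCA(n_components=2), LinearDiscriminantAnalysis())
predicted = cross_val_predict(pca_lda, spectra, labels, cv=LeaveOneOut())

error_rate = float((predicted != labels).mean())
da_accuracy = float((predicted[labels == 1] == 1).mean())
osp_accuracy = float((predicted[labels == 0] == 0).mean())
print(f"error rate: {error_rate:.2%}, DA: {da_accuracy:.2%}, OSP: {osp_accuracy:.2%}")
```

Keeping PCA inside the pipeline means the left-out spectrum never influences the principal components used to classify it, which avoids an optimistic bias in the cross-validated error.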

Conclusion
In summary, the work presented here used Raman and Infrared spectroscopy to investigate the variations in loose and tight wet blue leather. To the best of our knowledge, this is the first study done in depth using ratiometric and chemometric analysis to identify and quantify the difference between two wet blue samples of OSP and DA. Vibrational spectroscopy with advanced spectral analysis can quantify the biomolecules which impact the quality, strength and sustainability of leather.
Classification from the peak area ratios was done using logistic regression that gives 100% accuracy for IR data and 90% accuracy for Raman data. Multivariate analysis has supported the Raman results for OSP and DA in describing the difference between the groups providing a clear representation of underlying biological differences. This study is a proof of principle to employ vibrational spectroscopy for quality assessment of leather.
Identification of issues at the raw skin stage and differentiation of the changes occurring at each stage of leather processing will be the next area of work, so that only high-quality leather with no defects is obtained. The analysis of Raman spectra in this work classifies leather samples based mostly on their chemical composition, as this factor has a strong influence on the shape of the Raman spectra. Structural factors are also likely to be important. Polarised Raman microscopy can provide structural information that complements the chemical information acquired from the spectral data alone. Further research will analyse the Amide I and Amide III bands using polarised Raman microscopy to provide information on cross-linking between microstructures in the samples.
Additional file 1. Supporting Information.