									                                                             (IJCSIS) International Journal of Computer Science and Information Security,
                                                                                                                       Vol. 9,No.10, 2011

Data Mining Applications in Modeling Transshipment Delays of Cargo Ships

P. Oliver Jayaprakash
Ph.D. student, Division of Transportation Engineering, Dept. of Civil Engineering, Anna University, Chennai, Tamil Nadu, India

Dr. K. Gunasekaran
Associate Professor, Division of Transportation Engineering, Dept. of Civil Engineering, Anna University, Chennai, Tamil Nadu, India

Professor, Dept. of EEE, Mepco Schlenk Engineering College, Sivakasi, Tamil Nadu, India

Abstract: Data mining methods have many applications in various fields of engineering. The present application area is port operations and management. Conventionally, port performance was assessed by the ship turnaround time, a marker of cargo handling efficiency. It is the time spent at port for transshipment of cargo and servicing. During transshipment and servicing, delays are inevitable and occur predominantly. The major delay at the port was due to the non-availability of trucks for evacuating cargo from the port wharf to the warehouses. Hence, the delay occurrences in port operations had to be modeled, so as to control the ship's turnaround time at the port and prevent additional demurrage charges. The objective of this paper was to study the variety of delays caused during the port processes and to model them using data mining techniques.

Keywords: Data mining techniques; Transshipment delays; Shunt trucks; Artificial neural network; Nonlinear analysis.

I.    INTRODUCTION

The growing volume of port-related transshipment data raises many challenges; one is to extract, store, organize and use the relevant knowledge generated from those data sets. Data covering differing time periods can be deployed for various engineering applications. Innovations in computing infrastructure and the emergence of data mining tools have an impact on decision making related to port shipment operations. The growing demand for data mining has led to the development of many algorithms that extract knowledge and features such as missing data values, correlation, trend and pattern from large-scale databases. Data mining techniques play a crucial role in several fields of engineering applications. They help managers format the data collected on an issue and extract potentially useful information from it through preprocessing and warehousing tools. Conventional multiple linear regression (MLR) models were replaced by nonlinear and artificial neural network (ANN) models to predict future variable values of complex systems, even with minimal data, because of their accuracy and reliability. This paper focuses on the application of data mining techniques in processing the transshipment delays of non-containerized ships and modeling them using MLR, nonlinear regression (NLR) and ANN models.

A ship's service time, which affects the quantum of consignments imported and exported in a particular time period, was much influenced by berth planning and allocation. It also affects the ship turnaround time, since it decides the vessel's length of stay at port. The delay caused by shunt trucks at port gates was one of the crucial issues faced by port authorities. The cargo evacuation period was influenced by the shunt trucks' turnaround time. The turnaround time of a truck was estimated as the time taken to evacuate the cargo completely from the port's quay or wharf to the company warehouses located in the port's outer area. Port terminals try to minimise the truck turnaround time so as to reduce the inland transportation cost of cargo evacuation. The delay component was significant, varying and high in developing countries compared to the efficient ports of developed countries.

The export or import of a commodity follows the port system procedures given in Figure 1. The major factors affecting the ship servicing delay were lengthy port operational procedures in importing or exporting the cargo, ship-related delays (not related to the port), port-related delays and delays due to carriers. Hence, it was necessary to analyse the causes behind the delays and to formulate strategies to minimise them.

Figure 1 Operations in non-containerised cargo

                                                                                                       ISSN 1947-5500
The procedures related to truck shunt operations to evacuate the cargo are listed below.

  Procedures involved in transshipment operations
  •     Prepare transit clearance
  •     Inland transportation
  •     Transport waiting for pickup & loading
  •     Wait at port entry
  •     Wait at berth
  •     Terminal handling activities

II.   PAST LITERATURE

Ravikumar [1] compared various data masking techniques such as encryption, shuffling and scrubbing, and their wide applications in various industries to secure data from hacking, and discussed the advantages of random replacement as one of the standard methods for data masking with the highest order of security. Mohammad Behrouzian [2] discussed the advantages, limitations and applications of data mining in various industries and the banking industry, especially in customer relationship management. According to Krishnamurthy [3], data mining is an interface among broad disciplines such as statistics, computer science, artificial intelligence, machine learning and database management. Kusiak [4] introduced the concepts of machine learning and data mining and presented case studies of their applications in the industrial, medical and pharmaceutical domains.

Chang Qian Gua [5] discussed the gate capacity of container terminals and built a multiserver queuing model to quantify and optimize truck delays. Wenjuan Zhao and Anne V. Goodchild [6] quantified the benefits of truck information that can significantly improve crane productivity and reduce truck delay for terminals operating with intensive container stacking. The UNCTAD report [7] suggests various port efficiency parameters to rank berth productivity: average ship berth output, delays at berth, duration of waiting for berth and turnaround time. Nathan Huynh [8] developed a methodology for examining the sources of delay of dray trucks at container terminals and offered specialized solutions using decision trees, a data mining technique. U. Bugaric [9] developed a simulation model to optimize the capacity of bulk cargo river terminals by reducing transshipment delay without investing in capital costs. Mohammed Ali [10] simulated the critical conditions in which ships were delayed offshore and containers were shifted to port by barges. Kasypi Mokhtar [11] built a regression model for vessel turnaround time considering the transshipment delays and the number of gangs employed per shift. Simeon Djankov [12] segregated the pre-shipment activities into inspection and technical clearance; inland carriage and handling; and terminal handling, including storage, customs and technical control. He also conducted an opinion survey to estimate the delay caused in document clearance, fee payment and approval processes. F. Soriguera, D. Espinet and F. Robuste [13] optimized the internal transport cycle using an algorithm, by investigating subsystems such as landside transport and storage of containers in a marine container terminal. Brian M. Lewis, Alan L. Erera and Chelsea C. White [14] designed a Markov-process-based decision model to help stakeholders quantify the productivity impacts of temporary closures of a terminal. The use of decision trees to gain insight into terminal operations, instead of exhaustive data analysis, has also been demonstrated. Rajeev Namboothiri [15] studied the fleet operations management of drayage trucks in a port; truck congestion at ports may lead to serious inefficiencies in drayage operations. H. Murat Celik [16] developed three different ANN models for freight distribution of short-term inter-regional commodity flows among the 48 continental states of the US, utilizing 1993 commodity survey data. Peter B. Marlow [17] proposed a new concept of agile ports, measuring port performance with both quantitative and qualitative parameters. Rahim F. Benekohal, Yoassry M. El-Zohairy and Stanley Wang [18] evaluated the effectiveness of an automated bypass system in minimizing traffic congestion, using automatic vehicle identification and low-speed weigh-in-motion around a weigh station in Illinois to facilitate preclearance for trucks at the weigh station. Jose L. Tongzon [19] built a port performance model to predict the efficiency of transshipment operations. The present research focuses on bulk ports handling non-containerized cargo ships. The transshipment delay data were used to build a predictive model for future ship delays.

                         TABLE I
           SUMMARY OF TRANSSHIPMENT DELAY DATA

    Variable         Mean      S.D.      Min.      Max.
    X1               102       55        34        504
    X2               0.88      0.36      0.26      1.74
    X3               0.03      0.04      0.00      0.08
    X4               0.28      0.12      0.05      0.72
    X5               27.00     25.00     5.00      80.00
    X6               2.35      1.44      0.33      5.78
    X7               0.04      0.03      0.01      0.18
    X8 (inferred)    0.038     0.026     0.01      0.18
    Y                0.18      0.09      0.00      0.35

Where,
Y = Transshipment delay of non-containerized cargo; X1 = Number of evacuation trucks; X2 = Truck travel time; X3 = Gang nonworking time; X4 = Truck shunting duration; X5 = Trip distance; X6 = Berth time at berths; X7 = Waiting time at berth; X8 = Other miscellaneous delays. The row marked "inferred" carries no label in the source; X8 is assumed from the row order.

III.   DATA COLLECTION & ANALYSIS

The non-containerised cargo ship data for a study port were collected for the past five years, from 2004 to 2009, from various sources including India seaports [20, 21 & 22]. The data comprised the number of ship cranes, number of trucks required to evacuate, crane productivity, truck travel time, idle time, gang idle time, truck shunt time, truck trip distance,
delay caused at berth and the gross delay, ship waiting time for berth outside the channel, time spent on berth (berthing time) and ship turnaround time. The summary of the ship delay data and the methodology of the study are presented in Table I and Figure 2.

A. Preprocessing, Correlation and Trend

The collected data were preprocessed using a data transformation algorithm, the missing values in the database were filled, and descriptive statistics were estimated. The average crane working time was 5.93 hours per day and the mean gang idle time was 0.03 days. The mean berthing time was 2.3 days and the mean ship turnaround time was 2.71 days. A multivariate analysis was done to estimate the correlation among the dependent and independent variables; the resulting correlation matrix is presented in Table II. The average crane efficiency at the study port was 19616 tonnes per day, the average ship waiting time at berth was 0.04 day and the mean crane productivity was 7.67 tonnes per hour. The average number of trucks required for evacuation was 104, the mean truck travel time was 0.88 hour and the mean delay caused to the ship at the port was 0.18 day.

To study the relationship between the independent variables and the dependent variable, correlation analysis was carried out and the results are presented in Table II. The dependent variable, transshipment delay, is highly correlated with the delay caused at the storage area and by the gang/workforce, and it is further correlated with the ship berthing time at port. It is also significantly correlated with the number of evacuation trucks, truck travel time and trip distance.

                                   TABLE II
                    CORRELATION VALUES BETWEEN VARIABLES

          X1      X2      X3      X4      X5      X6      X7      X8      Y
    X1    1.00   -0.98   -0.35   -0.50   -0.18    0.07    0.13    0.00   -0.21
    X2   -0.98    1.00    0.37    0.53    0.17   -0.05   -0.11   -0.02    0.22
    X3   -0.35    0.37    1.00    0.25    0.11   -0.03   -0.05   -0.03    0.54
    X4   -0.50    0.53    0.25    1.00    0.08   -0.52   -0.06   -0.01   -0.04
    X5   -0.18    0.17    0.11    0.08    1.00    0.01   -0.02    0.03    0.15
    X6    0.07   -0.05   -0.03   -0.52    0.01    1.00    0.17   -0.37    0.02
    X7    0.13   -0.11   -0.05   -0.06   -0.02    0.17    1.00   -0.34   -0.19
    X8    0.00   -0.02   -0.03   -0.01    0.03   -0.37   -0.34    1.00    0.48
    Y    -0.21    0.22    0.54   -0.04    0.15    0.50    0.20    0.60    1.00

Figure 2 Methodology of the study

Using the historical data collected on transshipment delay, an ANN model was built to study the relationship between transshipment delay and other influencing parameters. An MLR model and a multivariate nonlinear regression model were also built for the same data, and the statistical performance and prediction accuracy of the models were compared and the outcomes presented.

A. Artificial neural network modeling

An artificial neural network is an emulation of a biological neural system that can learn and calibrate itself. It is developed with a systematic step-by-step procedure to optimize a criterion, the learning rule. Input data and output training are fundamental for these networks to produce an optimized output. A neural network is good at learning patterns among the input data. The prediction accuracy increases with the number of learning cycles and iterations. The gross transshipment delay caused to a commodity ship tends to vary with the type of cargo, season, shipment size and other miscellaneous factors. A popular and accurate prediction technique, MATLAB's back-propagation neural network (BPNN) module, was utilized to predict the transshipment delay faced by non-containerised ships from past data. Figure 3 presents the hidden layer and architecture of the BPNN. The ANN-based model was built and trained using three years of past data, and two years of data were used for testing and production. The inputs (fleet strength of evacuation trucks, truck travel time, delay due to gang/workforce, idle time, shunting time, trip distance, berth time and delay at storage area) were given as batch files, and script programming was used to run the neural network model with adequate hidden neurons. The output, transshipment delay, was generated and compared with the MLR and nonlinear regression model outputs.
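The back-propagation idea described above can be illustrated with a minimal sketch. It is not the MATLAB BPNN module used in the study; it is a plain NumPy stand-in, assuming one tanh hidden layer, a linear output, batch gradient descent on the mean squared error, and synthetic data in place of the port records. The function names, the hidden-layer size, the learning rate and the synthetic target are all illustrative assumptions.

```python
import numpy as np

def train_bpnn(X, y, hidden=6, epochs=2000, lr=0.05, seed=0):
    """Train a one-hidden-layer network with batch backpropagation.

    X: (n_samples, n_inputs) scaled inputs; y: (n_samples,) target delays.
    Returns the weights and the mean squared error recorded at each epoch.
    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1))
    b2 = np.zeros(1)
    target = y.reshape(-1, 1)
    history = []
    for _ in range(epochs):
        # Forward pass: tanh hidden layer, linear output neuron.
        h = np.tanh(X @ W1 + b1)
        pred = h @ W2 + b2
        err = pred - target
        history.append(float(np.mean(err ** 2)))
        # Backward pass: gradients of the mean squared error.
        g_pred = 2.0 * err / len(X)
        g_W2 = h.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_h = (g_pred @ W2.T) * (1.0 - h ** 2)
        g_W1 = X.T @ g_h
        g_b1 = g_h.sum(axis=0)
        # Gradient-descent update (the learning rule).
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2
    return (W1, b1, W2, b2), history

def predict_bpnn(params, X):
    """Run the trained network forward and return one prediction per row."""
    W1, b1, W2, b2 = params
    return (np.tanh(X @ W1 + b1) @ W2 + b2).ravel()

# Synthetic stand-in data: eight scaled inputs and one delay-like target.
rng = np.random.default_rng(1)
X_demo = rng.uniform(-1.0, 1.0, (200, 8))
y_demo = 0.18 + 0.05 * X_demo[:, 2] + 0.04 * X_demo[:, 7] ** 2
params, history = train_bpnn(X_demo, y_demo)
```

In this sketch the recorded error history plays the role of the learning cycles mentioned in the text: the training error is expected to fall as the number of epochs grows. A real application would also hold out testing and production samples, as the study does.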
The ANN sample statistics (training, testing and production) are given in Table III. Table IV presents the ANN output statistics. The error in prediction was low (0.006 to 0.015), and the correlation coefficient was 0.93.

Multiple linear regression model for the gross transshipment delay of non-containerised cargo ships:

$$
Y = 0.108 + 3.47\times10^{-5}\,X_1 + 4.953\times10^{-2}\,X_2 + 0.942\,X_3 - 1.988\times10^{-(\ldots)}\,X_4 + 1.662\times10^{-4}\,X_5 + 4.397\times10^{-4}\,X_6 + 2.462\times10^{-2}\,X_7 + 1.006\,X_8 \tag{1}
$$

(The exponent of the X4 coefficient is truncated in the source.)

Where, X1 = Number of evacuation trucks; X2 = Truck travel time; X3 = Gang nonworking time; X4 = Truck shunting duration; X5 = Trip distance; X6 = Berth time at berths; X7 = Waiting time at berth; X8 = Other miscellaneous delays; Y = Transshipment delay.
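As a hedged illustration of how Eq. (1) would be applied, the sketch below evaluates the linear model for one set of inputs. The intercept and the coefficients of X1 to X3 and X5 to X8 are taken from the equation as printed. The exponent of the X4 coefficient is not recoverable from the source, so the hypothetical parameter `x4_exponent` must be supplied by the caller and is an assumption, as are the function and variable names.

```python
# Coefficients of Eq. (1) as printed; X4 is handled separately because
# its exponent is truncated in the source.
INTERCEPT = 0.108
COEFFS = {
    "X1": 3.47e-05,   # number of evacuation trucks
    "X2": 4.953e-02,  # truck travel time
    "X3": 0.942,      # gang nonworking time
    "X5": 1.662e-04,  # trip distance
    "X6": 4.397e-04,  # berth time
    "X7": 2.462e-02,  # waiting time at berth
    "X8": 1.006,      # other miscellaneous delays
}

def mlr_delay(x, x4_exponent):
    """Evaluate the MLR model of Eq. (1) for one observation.

    x: dict mapping "X1".."X8" to values.
    x4_exponent: ASSUMED positive integer k, giving an X4 coefficient of
    -1.988 * 10**(-k); the true k is not legible in the source.
    """
    x4_coef = -1.988 * 10.0 ** (-x4_exponent)
    total = INTERCEPT + x4_coef * x["X4"]
    for name, coef in COEFFS.items():
        total += coef * x[name]
    return total

# Example: every input zero except 0.1 day of miscellaneous delay (X8).
example = {f"X{i}": 0.0 for i in range(1, 9)}
example["X8"] = 0.1
delay = mlr_delay(example, x4_exponent=2)  # exponent value is a placeholder
```

With all inputs at zero the function returns the intercept, which is a quick sanity check on the transcription of the coefficients.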

Figure 3 Hidden layer & architecture of BPNN

                                 TABLE III

    Cargo               Samples for      Samples for     Samples for      Total
                        Training No.     Testing No.     Prodn No.        No.
    Non-containerised   1243 (38.6%)     638 (19.9%)     1339 (41.6%)     3221

                 TABLE IV

    ANN output parameters        Value
    R squared                    0.87
    r squared                    0.87
    Mean squared error           0.001
    Mean absolute error          0.01
    Correlation coefficient      0.93

B. Multiple linear regression models

Multiple linear regression analysis was used to build a model between the independent and dependent variables to estimate the gross transshipment delay caused to a non-containerized ship at port (including delay at berth and other delays due to gang, crane and other parameters). The correlations between the variables were found from the multivariate correlation analysis, and the variables with a significant relationship were chosen for MLR model building. The variables selected are X1 to X8, as defined for Eq. (1).

C. Multivariate nonlinear regression analysis

Multivariate nonlinear regression (MNLR) analysis was performed to build a model between the independent and dependent variables to estimate the gross transshipment delay caused to the non-containerized category of ships. The nonlinear analysis captures the effect of the dynamics of the independent variables on the dependent variable. The estimated MNLR model is given in Eq. (2).

Nonlinear regression model:

$$
Y = \frac{N}{D} \tag{2}
$$

$$
\begin{aligned}
N ={} & -9.435\times10^{-2} - 1.806\times10^{-2}\,U + 4.51231\times10^{-3}\,U^{2} + 12.41806\,V - 0.949\,UV + 7.95\times10^{-2}\,V^{2} \\
& + 0.127\,W + 4.675489\times10^{-2}\,UW - 25.03726\,VW + 1.599472\times10^{-2}\,W^{2} + 4.856763\times10^{-2}\,X \\
& - 0.0139986\,UX + 1.352323\,VX - 1.153036\times10^{-2}\,WX - 2.087984\times10^{-3}\,X^{2}
\end{aligned}
$$

$$
D = 1 + 6.954577\,UV + 0.3523445\,UW - 120.3657\,VW - 8.882952\times10^{-2}\,UX + 10.20601\,VX + 7.149175\times10^{-3}\,WX
$$

Where, Y = Gross transshipment delay; U = 1/√(truck trip time), written as 1/SQRT(truck_Tt) in the first two terms of the source; V = (Gang idle period)^2; W = 1/√(truck shunting time); X = Log(craneff_ton).

                         TABLE V
    PERFORMANCE OF MLR & MULTIVARIATE NONLINEAR REGRESSION ANALYSIS

    Output parameters            MLR analysis    MNLR analysis
    RMS Error                    8.40E-03        7.87E-02
    R-Squared                    0.90            0.35
    Coefficient of Variation     3.90E-02        3.93E-03
    Press R-Squared              0.89            0.34

V.   RESULTS & DISCUSSIONS

The actual (observed) service time values were plotted against the ANN, MLR and MNLR forecasted outputs for non-containerised cargo and are presented in Figure 4.

Figure 4 Observed, MLR, MNLR & ANN forecasted values

A sensitivity analysis was carried out to study the influence of port characteristics on delays using the proposed models. The gross delay was directly proportional to the crane efficiency and truck shunting time. As the crane efficiency increases from 2000 T to 12000 T, the delay might increase from 0.20 days to 0.366 days. The delay is optimised for the range of 55 to 75 shunting trucks. The crane efficiency also varies with the shunt trucks' efficiency in transshipment. The effect is influenced by the level of service or congestion levels of roads. The gross delay is affected by port berth delays. It could be reduced by minimising the ship berth time
at wharf. From the sensitivity analysis, it was concluded                           Trade and Development,Geneva, Vol.3, 2009, pp.65-79.
that,even if a port well equipped port with state of the art                        [8] Nathan Huynh and Nathan Hutson, “Mining the Sources of Delay for
                                                                                    Dray Trucks at Container Terminals”, TRR, TRB 2066, 2008,pp. 41–49.
infrastructure,may face transhipment delay, due to          its                     [9] U.Bugaric and D.Petrovic, “Increasing the capacity of terminal for bulk
operational deficiencies such as issues related to work shifts,                     cargo unloading,” Simulation modelling practice and theory, vol.15, 2007,
labours discipline, insufficienct shunt trucks and cranes.                          pp. 1366–1381.
                                                                                    [10] Mohammed Ali, Alattan bila Varikarkae and Neelara jhsans,
                                                                                    “Simulation of container queues for port investment decision”, Proceedings
                                                                                    of ISORA’06, Xinjiang, 2006, pp.155-167.
                                                                                    [11] Kasypi Mokhtar and Dr.Muhammad zaly shah, “A regression model
                                                                                    for vessel Turnaround time”, Proceedings of TAICI, 2006,pp. 10-19.
                                                                                    [12] Simeon Djankov, Caroline Freund and Cong S. Pham, 2006, “Trading
                                                                                    on Time”, World Bank, Development research group, 2006,pp. 1-39.
                                                                                    [13] F. Soriguera, D. Espinet and F. Robuste, “Optimization of the internal
                                                                                    transport cycle in a marine container terminal managed by Straddle
                                                                                    carriers”, TRR (2006), TRB, 2007.
                                                                                    [14] Brian M. Lewis, Alan L. Erera and Chelsea C. White, “Impact o
                                                                                    Temporary Seaport Closures on Freight Supply Chain Costs”,TRR (2006),
                                                                                    Vol.1963 (1), pp. 64-70
                                                                                    [15] Rajeev Namboothiri and Alan L. Erera, “A set partitioning heuristic
                                                                                    for local drayage routing under time-dependent port delay”,7803-8566-/04,
                                                                                    2004 IEEE.
                                                                                    [16] H. Murat Celik, “Modeling freight distribution using artificial neural
                                                                                    networks”, Transport Geography, vol.12, 2004, pp.141- 148.
                                                                                    [17] Peter B. Marlow and Ana C. Paixão Casaca, “Measuring lean ports
                                                                                    Performance”, Transport Management, Vol.1 (4), 2003, pp.189-202.
                                                                                    [18] Rahim F. Benekohal, Yoassry M. El-Zohairy and Stanley Wang,
                                                                                    “Truck Travel Time around Weigh Stations Effects of Weigh in Motion
                                                                                    and Automatic Vehicle Identification Systems”, TRR 1716 _135,TRB
                                                                                    2000, pp. 138-143.
                                                                                    [19] Jose L. Tongzon, “Determinants of port performance and efficiency”,
                                                                                    TR, Part A, Vol.29 (3), 1995, pp.245-252.
                                                                                    [20] Http://
                                                                                    [21] Ports of India website; Http://
                                                                                    [22] Position paper on “The ports sector in India”, Dept. of Economic
                                                                                    Affairs, Ministry of Finance, Government of India, 2009.

                   Figure 5 Sensitivity analysis outputs

                           VI CONCLUSION
From the outputs of the ANN, MNLR and MLR analyses, it was concluded
that the prediction accuracy of the ANN model was established by its
R² (0.87) and correlation coefficient (0.93). This paper discussed the
application of data mining techniques to the predictive analysis of
future delays faced by non-containerised cargo at port berths. Further,
the approach has scope for application to various other issues connected
with cargo transhipment in the port sector.

                           AUTHORS PROFILE

P. OLIVER JAYAPRAKASH is at present Assistant Professor in the Civil
Engineering department, Mepco Schlenk Engineering College, Sivakasi,
Tamilnadu, India. His fields of interest include soft computing
applications in freight logistics planning. He is currently pursuing his
Ph.D at Anna University, Chennai under Dr. K. Gunasekaran.

Dr. K. GUNASEKARAN is an Associate Professor in the Transportation
Engineering Division of Anna University, Chennai. His research interests
include simulation, analysis and modeling of accidents and their
prevention, and GIS & GPS applications in traffic analysis and
management. He has published several research articles in various journals.

Dr. S. MURALIDHARAN is at present Professor in the Electrical and
Electronics Engineering department, Mepco Schlenk Engineering College,
Sivakasi, Tamilnadu. His research interests include fuzzy logic and
neural network applications to power system planning, optimization and
control problems. He has published several research articles in various
reputed international conferences and journals.

                             REFERENCES

[1] G. K. Ravi Kumar, B. Justus Rabi, Ravindra S. Hegadi, T.N. Manjunath
and R. A. Archana, “Experimental study of various data masking techniques
with random replacement using data volume”, IJCSIS - International Journal of
Computer Science and Information Security, Vol. 9, No. 8, August 2011, pp.
154-157.
[2] Mohammad Behrouzian Nejad, Ebrahim Behrouzian Nejad and Mehdi
Sadeghzadeh, “Data Mining and its Application in Banking Industry, A
Survey”, IJCSIS - International Journal of Computer Science and Information
Security, Vol. 9, No. 8, 2011.
[3] I. Krishna Murthy, “Data Mining- Statistics Applications: A Key to
Managerial Decision Making”, socio - economic voices, 2010
11 pp. 1-11.
[4] A. Kusiak, “Data mining: manufacturing and service applications”,
International Journal of Production Research, Vol. 44, Nos. 18–19, 15th
September – 1st October 2006, pp. 4175–4191.
[5] Chang Qian Gua and Rong fang (Rachel) Liu, “Modeling Gate Congestion
of Marine Container Terminals-Truck Cost and Optimization,” TRR, TRB
No.2100, 2009, pp.58–67.
[6] Wenjuan Zhao and Anne V. Goodchild, “Impact of Truck Arrival
Information on System Efficiency at Container Terminals”, TRR, TRB.2162,
2010, pp. 17–24.
[7] UNCTAD Transportation newsletter, “United Nations Conference on

                                                                                                                    ISSN 1947-5500
