Estrategias Avanzadas de Tabulación y Paralelismo en Programas Lógicos = Advanced Evaluation Strategies for Tabling and Parallelism in Logic Programs


UNIVERSIDAD POLITÉCNICA DE MADRID
FACULTAD DE INFORMÁTICA

Advanced Evaluation Strategies for Tabling and Parallelism in Logic Programs (Estrategias Avanzadas de Tabulación y Paralelismo en Programas Lógicos)

Doctoral Thesis

Pablo Chico de Guzmán Huerta, Computer Science Engineer

November 2012

Departamento de Lenguajes y Sistemas e Ingeniería del Software, Facultad de Informática

Candidate: Pablo Chico de Guzmán Huerta, Computer Science Engineer, Universidad Politécnica de Madrid, Spain.

Advisor: Manuel Carro Liñares, Doctor in Computer Science, Graduate in Computer Science, Universidad Politécnica de Madrid.

Advisor: Manuel V. Hermenegildo Salinas, Doctor in Computer Science, Graduate in Computer Science, Universidad Politécnica de Madrid.

Madrid, November 2012.

This work is licensed under the Creative Commons Attribution-Share Alike 3.0 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.


UNIVERSIDAD POLITÉCNICA DE MADRID

Committee appointed by the Magnificent and Most Excellent Rector of the Universidad Politécnica de Madrid on ...... of ........................ 201...

President:
Member:
Member:
Member:
Secretary:
Substitute:
Substitute:

The defense and reading of the thesis took place on ...... of .................. 201... at the Facultad de Informática.

THE PRESIDENT. THE MEMBERS. THE SECRETARY.


To everyone who loves me, especially Ana and my family.


Acknowledgements (translated from the Spanish "Agradecimientos")

One of the most important aspects of the success of a doctoral thesis is the working environment in which it is carried out and the people one works with day to day. In this respect, I have been very fortunate to carry out my thesis at the Universidad Politécnica de Madrid, within the CLIP research group. I take this opportunity to thank all the members of CLIP: thanks to them the Ciao Prolog system, the language in which I developed this thesis, is maintained. I would especially like to mention Manuel Carro and Manuel Hermenegildo, the advisors of this thesis, who combine their human qualities with outstanding knowledge of the subjects this thesis addresses, and without whom it would have been impossible to complete it. I would also like to thank David Warren and Peter Stuckey for the opportunity they gave me to undertake research stays in the USA and Australia, respectively. Thank you for your hospitality and for passing on part of your experience. I also thank the institutions that helped fund my doctoral studies, in particular the Universidad Politécnica de Madrid, the Ministerio de Educación y Ciencia and the IMDEA Software institute. Finally, I would also like to thank my family, my friends and especially my girlfriend Ana for their support and company. Thank you all for making every moment a special one.


Synopsis (translated from the Spanish "Sinopsis")

Among the programming paradigms in computer science is Logic Programming, whose main exponent is the Prolog language. Prolog programs consist of a set of predicates, each defined by means of rules that give the programmer a high level of abstraction and declarativeness. However, formulating programs with rules frequently means that a predicate is recomputed several times for the same query. In addition, Prolog uses a fixed order to evaluate rules and goals (SLD evaluation), which can enter "infinite loops" when executing declaratively correct recursive rules. Tabling attacks these limitations at their root: it "remembers" in a table the calls made and their solutions. Thus, when a call is repeated its solutions are already available and tabling avoids recomputing it. Tabling also avoids "infinite loops", since the calls that generate them are suspended, waiting for solutions to be computed for them through alternative derivations.

Implementing tabling is not simple. In particular, it requires three operations that cannot all be executed in constant time simultaneously: call suspension, call resumption and variable access. The first part of the thesis compares three implementations of tabling on Ciao, each of which penalizes one of these operations. Each solution therefore has its advantages and drawbacks and behaves better or worse depending on the program executed.

The second part of the thesis improves the functionality of tabling in order to combine it with constraints and to avoid unnecessary computations. Constraint programming allows equation solving to be used as a means of programming, a highly declarative mechanism. We have developed a framework for combining tabling with constraints, prioritizing goals such as the flexibility, efficiency and generality of our solution, and obtaining a synergy between the two techniques that is very useful in many applications. On the other hand, a fundamental aspect of tabling is the moment at which the solutions of a tabled call are returned. Local evaluation returns solutions only when all the solutions of the tabled call have been computed. In contrast, batched evaluation returns solutions one by one as they are computed, so it is better suited to problems where we are not interested in finding all the solutions. However, its memory consumption is exponentially worse than that of local evaluation. The thesis presents swapping evaluation, an alternative method that returns solutions as soon as they are computed but with memory consumption similar to that of local evaluation. In addition, pruning operators are implemented to discard the search for alternative solutions once the desired solution has been found.

Finally, Prolog adopts solutions for parallelism with relative ease thanks to the flexibility of its execution control and to the fact that its assignments are logical. The third part of the thesis extends the and-parallelism of Ciao to work with nondeterministic programs, which raises two main problems: trapped goals and goal recomputation. The classical solutions for trapped goals broke invariants of Prolog execution; they were difficult to maintain and extend and have fallen into disuse. We propose a modular, localized solution (based on the implementation of swapping evaluation) that does not break the invariants of Prolog execution while maintaining high performance for parallel execution. Regarding the recomputation of parallel goals in the presence of nondeterminism, we have adapted techniques derived from tabling to memoize computations of these goals and avoid recomputing them.

UNIVERSIDAD POLITÉCNICA DE MADRID
FACULTAD DE INFORMÁTICA

Advanced Evaluation Strategies for Tabling and Parallelism in Logic Programs

PhD Thesis

Pablo Chico de Guzmán Huerta

November 2012

Languages and Systems and Software Engineering Department, Computer Science School

PhD Candidate: Pablo Chico de Guzmán Huerta, Computer Science Engineer, Universidad Politécnica de Madrid, Spain.

Advisor: Manuel Carro Liñares, Doctor in Computer Science, Graduate in Computer Science, Universidad Politécnica de Madrid.

Co-advisor: Manuel V. Hermenegildo Salinas, Doctor in Computer Science, Graduate in Computer Science, Universidad Politécnica de Madrid.

Madrid, November 2012.

This work is licensed under the Creative Commons Attribution-Share Alike 3.0 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.


To everyone who loves me, especially Ana and my family.


Acknowledgements

One of the most important factors in the success of a PhD thesis is the work environment in which the PhD student carries out his work and the people who work in that environment. In this sense, I have been very fortunate to work in the CLIP research group, which is responsible for maintaining the Ciao Prolog system, where I have implemented most of my research. I would especially like to mention Manuel Carro and Manuel Hermenegildo, who have always been very kind and have shared their ideas and knowledge with me. This thesis would not have been possible without them. I would also like to thank David Warren and Peter Stuckey for accepting me as a student during my stays in the USA and Australia, respectively. Thank you for being so welcoming and for helping me to do research in areas I was not familiar with. Thanks also to the institutions that funded my PhD research: the Technical University of Madrid, the Spanish Science Ministry and the IMDEA Software Institute. Finally, I would like to thank my family, my friends and especially my girlfriend. Thank you all for making every single moment of my life a very special one.


Abstract

Prolog is the main language in the Logic Programming paradigm. A Prolog program is a set of predicates, each defined by a set of clauses, and its evaluation selects which clauses have to be computed in order to reach a conclusion. This representation is very declarative and offers a high level of abstraction to the programmer. Since predicates can be recursive, the standard evaluation strategy of Prolog often performs recomputations and can even enter infinite loops for programs which have a well-defined declarative meaning. Tabled evaluation attacks these issues by remembering the solutions for tabled predicates and by suspending looping predicate calls in order to find other answers by executing alternative clauses.

Implementing tabled evaluation is complex. It relies on three operations which cannot all be performed in constant time: call suspension, call resumption and variable access. The first part of this thesis compares three different tabled evaluation implementations (implemented in Ciao), each of which penalizes one of these three operations. Each implementation has advantages and disadvantages depending on the behavior of the specific tabled application.

The second part of this thesis improves the functionality of tabled evaluation in order to combine it with constraints and to avoid unneeded computations. Constraint Logic Programming (CLP) is a natural extension of Logic Programming that applies efficient, incremental constraint solving techniques which blend seamlessly with the characteristics of logical variables and which increase the expressive power and declarativeness of Logic Programming. We present a complete implementation framework for constraint tabled evaluation, independent of the constraint solver, which enlarges the application domain of tabled evaluation. On the other hand, a key decision in tabled evaluation is the moment at which answers are returned to the tabled call. Local evaluation returns solutions only when all of them have been computed, and has low memory consumption. Batched evaluation returns answers on demand, which gives it wider applicability, but its memory behavior is unacceptable for most applications. This thesis presents swapping evaluation, a method that returns answers on demand but whose memory behavior is much closer to that of local evaluation. We also support pruning operators in order to prune the search when the desired solution is found.

Last but not least, Prolog adapts very well to parallelism due to the flexibility of its execution control and the use of logical variables. The third part of this thesis extends the independent and-parallelism capability of Ciao to work with nondeterministic programs. Two major issues here are the trapped goal problem and the recomputation of goals. The classical solutions for the trapped goal problem break several invariants of the Prolog abstract machine, affecting the implementation of the system everywhere. Consequently, they are difficult to maintain and to extend, which is why they tend not to be used. We propose a modular solution which does not break any invariant of the Prolog abstract machine while keeping the same performance for parallel executions (this solution is closely related to swapping evaluation). With respect to the recomputation of goals, we have adapted techniques from tabled evaluation for remembering previous computations in order to improve performance.

Contents

Abstract

1 Introduction
  1.1. Logic Programming
    1.1.1. The Prolog Language
  1.2. Motivation
  1.3. Thesis Contributions
  1.4. Structure of the Thesis
    1.4.1. Tabling Background
    1.4.2. Suspension-based Tabling Implementation Approaches
    1.4.3. A General Implementation Framework for Tabled CLP
    1.4.4. Swapping Evaluation under Tabled LP
    1.4.5. Pruning Operators under Tabled LP
    1.4.6. Swapping Operation for Executing Trapped Computations
    1.4.7. Memoization of Parallel Computations

2 Tabling Background
  2.1. Introduction
  2.2. Tabled Evaluation by an Example
  2.3. Scheduling Strategies
  2.4. Tabling Applications
  2.5. The Table Space for Tabling
    2.5.1. The Trie Data Structure
    2.5.2. Using Tries to Organize the Table Space
  2.6. Tabling Explained as a Source to Source Transformation
  2.7. Protecting Consumer Memory from Backtracking
    2.7.1. Warren Abstract Machine
    2.7.2. Implementation Details of protect_consumer/1

3 Suspension-based Tabling Implementation Approaches
  3.1. Introduction
  3.2. Callable Copy Approach to Tabling
    3.2.1. What needs to be saved?
    3.2.2. Improving Consumer Resumption Complexity
  3.3. Optimized Copy Hybrid Approach to Tabling
  3.4. Multi-Value Binding Approach to Tabling
    3.4.1. MVB Tabling Execution
  3.5. Performance Evaluation of Suspension-based Tabling Implementations
    3.5.1. Theoretical Complexity Analysis
    3.5.2. Experimental Evaluation
  3.6. Conclusions

4 A General Implementation Framework for Tabled CLP
  4.1. Introduction
  4.2. Interaction Between Tabling and CLP
  4.3. A General Framework for TCLP
    4.3.1. Constraint Global Table
    4.3.2. TCLP Program Transformation
    4.3.3. Consumer Suspension/Resumption
    4.3.4. Improvements to Constraint Domain Operations
  4.4. Some Samples of Constraint Solvers in TCLP
    4.4.1. Equality and Disequality Constraints
  4.5. Difference Constraints
  4.6. Experimental Performance Evaluation
  4.7. Ciao TCLP versus TCHR / XSB
    4.7.1. Timed Automata Applications
  4.8. Related Work
  4.9. Conclusions

5 Swapping Evaluation under Tabled LP
  5.1. Introduction
  5.2. Motivation
  5.3. Improving Memory Usage by Precise Completion Detection
    5.3.1. An Overview of ASCC Memory Behavior
    5.3.2. A Solution: Imposing SCC Memory Behavior
  5.4. Swapping Evaluation: the General Idea
    5.4.1. External Consumers: More Stack Freezing than Needed
    5.4.2. Swapping Evaluation: External Consumers No Longer Suspend
  5.5. Experimental Performance Evaluation
    5.5.1. First-Answer Queries
    5.5.2. All-Solution Queries
  5.6. Combining Local and Swapping Evaluation
  5.7. Swapping Evaluation Implementation Details
  5.8. Porting Swapping Evaluation to Ciao
  5.9. Conclusions

6 Pruning Operators under Tabled LP
  6.1. Introduction
  6.2. Issues for Supporting Pruning under Tabled LP
    6.2.1. !/0 Operator under Tabled LP
    6.2.2. Behavior of once/1
  6.3. Applications of once/1
    6.3.1. Generate & Test Applications
    6.3.2. Early Completion Optimization
    6.3.3. Pruning at the Top Level
    6.3.4. If-Then-Else Prolog Transformation
    6.3.5. Application to Minimization Problems
  6.4. Implementation Details of the once/1 Operator
    6.4.1. Once Scope Data Structure
    6.4.2. The Management of Once Scopes
    6.4.3. Terminology
    6.4.4. The Pruning of a Once Scope
    6.4.5. Pruning Optimizations
  6.5. Performance Evaluation
    6.5.1. Applications Searching an Answer Subset
    6.5.2. Early Completion based on once/1
  6.6. Related Work
  6.7. Conclusions

7 Swapping Operation for Executing Trapped Computations
  7.1. Introduction
  7.2. Restricted Independent And-Parallelism
  7.3. Unrestricted Independent And-Parallelism
  7.4. Sketch of Nondeterministic Parallel Execution
  7.5. The Trapped Goal Problem
  7.6. Reordering Stacks to Free Trapped Goals
    7.6.1. An Example of Stack Reordering
    7.6.2. Stack Reordering Algorithm
    7.6.3. Some Low Level Details
  7.7. Dealing with Garbage Slots
  7.8. Performance Evaluation
    7.8.1. Deterministic and Non-Deterministic Benchmarks
    7.8.2. Avoiding Trapped Goals: the Impact of Goal Precedence
  7.9. Other Applications for Stack Reordering
  7.10 Conclusions

8 Memoization of Parallel Computations
  8.1. Introduction
  8.2. An Overview of IAP with Parallel Backtracking
  8.3. An Execution Example
  8.4. Memoization vs. Recomputation
    8.4.1. Answer Memoization
    8.4.2. Combining Answers
  8.5. Trapped Goals and Backtracking Order
    8.5.1. Out-of-Order Backtracking
    8.5.2. First Answer Priority and Trapped goals
  8.6. The Scheduler for the Parallel Backtracking IAP Engine
    8.6.1. Looking for Work
    8.6.2. Executing Parallel Conjunctions
  8.7. Suspension of Speculative Goals
  8.8. A Note on Deterministic Parallel Goals
  8.9. Comparing Performance of IAP Models
  8.10 Conclusions

9 Conclusions and Future Work
  9.1. Conclusions
  9.2. Future Work


List of Figures

1.1. Fibonacci program and its execution.
1.2. An infinite SLD evaluation.
2.1. A successful tabled evaluation.
2.2. Using tries to represent terms.
2.3. Using tries to organize the table space.
2.4. Initial state.
2.5. Frozen consumer heap.
2.6. Optimized freezing operation.
3.1. Initial state.
3.2. Frozen consumer frames.
3.3. Consumer before suspension.
3.4. CHAT version of create_consumer/3.
3.5. Backtracking after CHAT suspension.
3.6. CHAT version of resume_consumer/1.
3.7. Pseudo-code for OCHAT untrail.
3.8. Pseudo-code for MVB access.
3.9. Pseudo-code for MVB untrail.
3.10 Tabled program.
3.11 MVB tabling execution.
3.12 MVB variables.
3.13 MVB trail management.
4.1. Looping, incomplete under variant tabling.
4.2. Constraint Global Table.
4.3. The new program transformation for TCLP.
5.1. Tabled program.
5.2. ASCC memory behavior.
5.3. SCC memory behavior.
5.4. Non-trivial scenario.
5.5. Choicepoint management.
5.6. And-Or tree execution.
6.1. !/0 example.
6.2. !/0 inconsistency.
6.3. Solution order dependency.
6.4. Constraint-based optimization.
6.5. once/1 predicate.
6.6. Once structures.
6.7. Consumer Optimizations.
6.8. bad xsb example.
7.1. Sequential program.
7.2. Fork-join annotations for p/3.
7.3. Predicate p/2 with an unrestricted operator-annotated clause.
7.4. Example of execution state in IAP with trapped goals.
7.5. Example of choicepoint reordering before executing a trapped goal.
7.6. Choicepoint reordering algorithm in an agent's stack set.
8.1. Execution of main/4 with memoization of answers and parallel backtracking.
8.2. Snapshot of agent's stacks during answer memoization process.
8.3. Trapped goal problem with ordered and out-of-order backtracking in IAP.
8.4. Parallel backtracking Prolog code.

List of Tables

3.1. Complexity of CCAT, OCHAT and MVB.
3.2. Performance evaluation of CCAT, OCHAT, MVB and XSB.
3.3. Some statistics on the dynamic behavior of MVB variables.
4.1. Time comparison for TCLP frameworks in ms.
4.2. Ciao TPLP vs. UPPAAL.
4.3. Non-Merging vs. Merging.
5.1. Time and memory comparison for first-answer queries.
5.2. Memory comparison for all-solution queries.
5.3. Time comparison for all-solution queries.
6.1. Time comparison of local and swapping evaluation with/without pruning operators. Times are in ms.
6.2. Time comparison of local using/not using early completion optimization.
7.1. Benchmark descriptions.
7.2. Trapped goal statistics.
7.3. Speedup comparison: dependence analysis vs. trapped goals.
8.1. Comparison of speedups for several benchmarks and implementations.


1 Introduction

Summary

This chapter introduces the conceptual ideas behind the Logic Programming paradigm and analyzes the objectives of this thesis. The structure of this document and a brief description of each chapter are also presented. Finally, it provides a general perspective of the contributions of this thesis, including the related publications and the collaborations with other authors.

1.1. Logic Programming

Logic Programming [82] is a programming paradigm based on a subset of First Order Logic named Horn Clause Logic. A logic programming system acts as a simple theorem prover that, given a theory (or program) and a query, uses the theory to check whether the query is satisfiable. A logic program consists of a collection of Horn clauses, which are usually written as:

    A :- B1, B2, ..., Bn

The literal A is the head of the clause, while the conjunction B1, B2, ..., Bn is the body of the clause. The head and the body of a rule are separated by the symbol ':-' (read as if) and the subgoals in the body of a rule are separated by the symbol ',' (read as and). Each Bi is called a subgoal. If the head of a clause is empty, then the clause is called a query. If the body is empty, then the clause is called a fact. If the head and the body are both non-empty, then the clause is called a rule. A sequence of clauses with the same functor in the head forms a predicate. Predicates can thus be made up of facts and/or rules.

A logic program represents the theory for a problem solution, and the particular solutions are found by the computation of the program. In general, the computation of a logic program determines whether a user query can be derived from the theory that the logic program represents.

Logic programming is often said to offer the following advantages [16]:

- Simple declarative semantics: a logic program is simply a collection of predicate logic clauses with an intuitive interpretation.
- Simple procedural semantics: a logic program can be read as a collection of recursive procedures.
- High expressive power: logic programs can be seen as executable specifications that, despite their simple procedural semantics, allow for designing complex and efficient algorithms.
- Inherent non-determinism: since in general several clauses can match a goal, problems involving search are easily programmed in this kind of language.

1.1.1. The Prolog Language

Prolog is arguably the most popular logic programming language. Prolog became a viable language in 1977, when David Warren developed the first Prolog compiler. This system showed good performance, comparable to the best Lisp implementations of that time. Later, Warren proposed a new abstract machine for executing compiled Prolog code known as the Warren Abstract Machine, or simply the WAM [133]. The WAM became the most popular way of implementing Prolog and almost all current Prolog systems are based on WAM technology. Advances in sequential Prolog compilation technology over the last two decades have made state-of-the-art Prolog systems highly efficient, with performance comparable to that of imperative languages such as C [132]. Prolog has been successfully applied in areas such as natural language processing, artificial intelligence, deductive databases and expert systems.

The computation process of Prolog is mainly based on two mechanisms: unification [109] and resolution. Unification is an operation that finds the most general common instance of two Prolog terms. A Prolog term is either a constant (also called an atom), a compound term or a variable. Compound terms are structured data objects of the form f(t1, ..., tn), where f/n is a functor with arity n and each ti is also a term. After the unification of two terms, some of their variables are instantiated. The logical nature of Prolog variables implies that they can be instantiated only once (destructive assignments are not allowed).

The resolution mechanism of Prolog is known as SLD resolution [82]. SLD resolution is a top-down resolution mechanism, where a subgoal in a query is unified with the head of a clause, generating a new query called the resolvent. The resolvent is formed by the body (after unification) of the matching clause and by the remaining subgoals in the initial query. This process is applied recursively until either a subgoal fails to find a clause with which to unify (failure), or an empty query is generated (success). After failure, execution backtracks in order to try another derivation which might satisfy the original query. This process, known as backtracking, undoes the instantiations (or bindings) of logical variables made by previous unification operations. SLD resolution imposes a fixed selection function: the leftmost subgoal of a query is always selected first. If several alternative clause heads can be unified with this subgoal, Prolog systems use the order of the clauses in the program. When the computation fails, the execution backtracks to the most recent state where it had left unexplored alternative clauses.
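As a small illustration of the two mechanisms just described, the following sketch shows unification with =/2 and clause-order backtracking in plain Prolog. The predicates color/1, unify_demo/2, no_common_instance/0, first_non_red/1 and all_colors/1 are hypothetical names introduced only for this example; they are not taken from the thesis.

```prolog
% Hypothetical facts, used only to illustrate clause order and backtracking.
color(red).
color(green).
color(blue).

% Unification: f(X, b) and f(a, Y) have the most general common
% instance f(a, b), so =/2 binds X to a and Y to b.
unify_demo(X, Y) :-
    f(X, b) = f(a, Y).

% Terms with different functors have no common instance,
% so their unification fails.
no_common_instance :-
    \+ (f(a) = g(a)).

% SLD resolution selects the leftmost subgoal and tries clauses in
% program order: color(C) first binds C to red, the test C \== red
% fails, and execution backtracks to the next clause of color/1.
first_non_red(C) :-
    color(C),
    C \== red.

% findall/3 collects every solution in the order in which
% backtracking produces it.
all_colors(Cs) :-
    findall(C, color(C), Cs).
```

With these definitions, the first answer to ?- first_non_red(C). is C = green, and ?- all_colors(Cs). gives Cs = [red, green, blue], which reflects the textual order of the clauses.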
The search tree is thus explored from top to bottom and from left to right. To better illustrate how Prolog works, Figure 1.1 (left-hand side) shows a small Prolog program that implements the well-known Fibonacci function. The program consists of three clauses that form the fib/2 predicate, where the first argument, N, is the input number and the second argument is the output, that is, the N-th Fibonacci number. The first and second clauses simply state that the Fibonacci numbers of 1 and 2 (input arguments) are both 1 (output argument). The

third clause is the recursive rule that computes the Fibonacci function. Initially, it checks whether the input argument N is greater than 2 and, if this is the case, it calls itself recursively twice with the first argument set to N - 1 and N - 2. The final result is the sum of the results obtained by these two calls. Variable names start with capital letters and names for constants start with lower case letters. Figure 1.1 (right-hand side) shows the execution sequence for the query fib(4,R).

Figure 1.1: Fibonacci program and its execution.

Program (left):

    fib(1,1).
    fib(2,1).
    fib(N,R) :-
        N > 2,
        N1 is N - 1,
        N2 is N - 2,
        fib(N1,R1),
        fib(N2,R2),
        R is R1 + R2.

Execution of the query ?- fib(4,R). (right):

     1. fail
     2. fail
     3. fib(3,R1), fib(2,R2), R is R1 + R2.
     4. fail
     5. fail
     6. fib(2,R11), fib(1,R12), R1 is R11 + R12, fib(2,R2), R is R1 + R2.
     7. fail
     8. fib(1,R12), R1 is 1 + R12, fib(2,R2), R is R1 + R2.
     9. R1 is 1 + 1, fib(2,R2), R is R1 + R2.
    10. fib(2,R2), R is 2 + R2.
    11. fail
    12. R is 1 + 2.
    13. R = 3.

1.2. Motivation

A major problem with Prolog is that SLD resolution presents some fundamental limitations when dealing with recursion and redundant sub-computations. One of the foundations of Logic Programming is that a logic program should be independent of the execution control, but the limitations of SLD resolution force Prolog programmers to take SLD semantics into account during program development. For example, the program in Figure 1.1 recomputes fib(2,R) several times, which could have been an arbitrarily large computation.
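To make the amount of recomputation concrete, the sketch below instruments a copy of the program of Figure 1.1 so that every call is recorded. The names fibc/2, fibc_/2, note_call/1, calls/1 and count_calls/4 are hypothetical helpers introduced only for this illustration; the logic of the three Fibonacci clauses is unchanged.

```prolog
% Instrumented copy of the Fibonacci program of Figure 1.1.
% All predicate names below are hypothetical, chosen for this sketch.
:- dynamic calls/1.

% Record one more call to fibc(N, _).
note_call(N) :-
    assertz(calls(N)).

% Wrapper: log the call, then run the original three clauses.
fibc(N, R) :-
    note_call(N),
    fibc_(N, R).

fibc_(1, 1).
fibc_(2, 1).
fibc_(N, R) :-
    N > 2,
    N1 is N - 1,
    N2 is N - 2,
    fibc(N1, R1),
    fibc(N2, R2),
    R is R1 + R2.

% count_calls(+N, +K, -R, -Count): R is the N-th Fibonacci number and
% Count is the number of times the subgoal fibc(K, _) was called while
% computing it under plain SLD resolution.
count_calls(N, K, R, Count) :-
    retractall(calls(_)),
    fibc(N, R),
    !,
    findall(x, calls(K), Xs),
    length(Xs, Count).
```

The query ?- count_calls(4, 2, R, C). yields R = 3, C = 2, matching the two calls to fib(2,_) visible in Figure 1.1. For ?- count_calls(20, 2, R, C). the result is R = 6765, C = 4181, which shows how quickly the number of repeated calls grows with the input.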

Also, it is quite common for logically correct programs to enter infinite loops, as the program in Figure 1.2 illustrates. This program defines a small directed graph, represented by the edge/2 predicate, with a reachability relation given by the path/2 predicate. Consider the query goal path(1,B). Using SLD resolution to solve this query leads to an infinite loop because the first clause of path/2 recursively calls path(1,B).

Figure 1.2: An infinite SLD evaluation.

Program:

    path(A,B) :- edge(A,C), path(C,B).
    path(A,B) :- edge(A,B).
    edge(1,2).
    edge(2,1).

Execution of the query ?- path(1,B).:

    1. edge(1,C), path(C,B).
    2. path(2,B).
    3. edge(2,C), path(C,B).
    4. fail
    5. path(1,B).   (infinite loop)

For these reasons, considerable effort has been devoted in recent years to increasing the declarativeness and expressiveness of Prolog. One of the resulting proposals is the use of tabling [24] (or tabled LP). In a nutshell, tabling consists of storing intermediate answers for subgoals. These answers can be reused when a repeated subgoal appears during the resolution process, in order to avoid recomputing the subgoal or entering an infinite loop. Therefore, tabling improves the termination properties of Prolog and can improve efficiency in programs which repeatedly perform some computation. These characteristics help make logic programs less dependent on clause and goal order, thereby bringing operational and declarative semantics closer together. Consequently, tabling has been successfully applied in many areas, including deductive databases, program analysis, and semantic Web reasoning, to name a few.

Our motivation is to make tabling compatible with other powerful mechanisms of Logic Programming in general, and of Prolog in particular, in order to enlarge the application domain of tabling. We work on the development of answer on-demand resolution strategies with support for pruning operators (current tabling systems do not behave well here) and on the combination of tabling with constraint logic programming. These improvements to the tabling functionality should be achieved with the best possible performance. We explore different tabling implementation approaches in order to analyze their pros and cons, which also gives us a better understanding of tabled execution. Tabling implementations are done at the level of the WAM, which makes them a tedious task. We will prioritize the adoption of modular solutions in order not to affect the WAM invariants, which improves the extensibility and the maintainability of the system.

Finally, an orthogonal way to improve performance is parallelism. We take advantage of some of the tabling techniques we have developed and adapt them to the implementation of independent and-parallelism with non-determinism. Again, parallel implementations are complex and we will prioritize modular solutions while keeping the best possible performance.

1.3. Thesis Contributions

This section summarizes the main contributions of this thesis:

• Study of different approaches for tabling implementation in order to provide the best support for tabling in Ciao [68]. Tabling is based on three different operations which cannot all be performed in constant time. Therefore, this thesis develops three different tabling implementations (each of which penalizes one of these operations) in order to experimentally evaluate the practical behavior of each tabling approach.
• One of the extensions of LP which has attracted the most interest is the use of constraints (CLP). The combination of tabled evaluation with constraints is not trivial, since the notion of implication between subgoals cannot be syntactically deduced. This thesis presents a general implementation framework for the combination of tabling with constraints.

• Tabled evaluation executes a fixpoint algorithm in order to compute all possible solutions. The most successful tabling implementations are based on returning solutions only after all the solutions have been computed. This is clearly inefficient in cases where only a subset of the answers is demanded. This thesis develops an answer on-demand tabled evaluation strategy which returns answers as soon as they are computed. Other answer on-demand tabled evaluation strategies show very inefficient memory behavior. The proposed answer on-demand tabled evaluation strategy overcomes this drawback.

• Answer on-demand tabled evaluation is not enough by itself to prune alternative execution paths once a sufficient answer is found. Pruning under tabled evaluation presents several issues which have not been deeply analyzed so far. This thesis analyzes these issues and presents a practical implementation of pruning under tabled evaluation.

• One of the main characteristics of the Logic Programming paradigm is its flexibility to adapt solutions for parallelism. On the other hand, the implicit non-determinism of Prolog creates a challenging scenario which makes parallel solutions for Prolog much more complex than expected. This thesis proposes the application of ideas generated in the context of tabling in order to facilitate and improve the implementation of and-parallelism with non-determinism in the context of Prolog.

1.4. Structure of the Thesis

This thesis is divided into three conceptual parts. The first part (Chapters 2 and 3) provides a background on tabling and studies different approaches for tabling implementation. The middle part (Chapters 4, 5 and 6) extends the functionality of tabling with common techniques of Logic Programming. The final part

(Chapters 7 and 8) applies ideas from tabling implementations to the (efficient) implementation of and-parallelism with non-determinism. We introduce each of these chapters in the next sections.

1.4.1. Tabling Background

The first part of Chapter 2 introduces the general ideas of tabling, a resolution strategy that overcomes some of the limitations of SLD resolution. Tabling is presented by means of an evaluation example and some applications are given. The second part of this chapter explains tabling in more detail. In particular, we describe the table space and a high-level program transformation for the execution of tabled programs, in order to facilitate the understanding of the following chapters.

1.4.2. Suspension-based Tabling Implementation Approaches

Chapter 3 analyses the most complex operations for tabling implementation. In particular, there are three different operations that cannot all be performed in constant time: consumer suspension, consumer resumption and variable access. We propose three different tabling implementations, each of which penalizes one of these operations. CCAT, a low-level implementation of our work presented in [26, 25], penalizes the consumer suspension operation. OCHAT penalizes the consumer resumption operation, and MVB, a work presented in [29], penalizes the variable access operation. Since these implementations are not time-comparable from a theoretical point of view, at the end of this chapter we carry out an experimental performance evaluation using a set of common tabling benchmarks in order to determine which one is the most adequate tabling implementation.

1.4.3. A General Implementation Framework for Tabled CLP

Chapter 4 has been published in [27] and is the result of an eight-month stay at the University of Melbourne with Professor Peter Stuckey. It describes a framework to combine tabling evaluation and constraint logic programming (TCLP). While

this combination has been studied previously from a theoretical point of view and some implementations exist, they either suffer from a lack of efficiency, flexibility, or generality, or have inherent limitations with respect to the programs they can execute to completion (either with success or failure). Our framework addresses these issues directly, including the ability to check for answer / call entailment, which allows it to terminate in more cases than other approaches. The proposed framework is experimentally compared with existing solutions in order to provide evidence of the mentioned advantages.

1.4.4. Swapping Evaluation under Tabled LP

Chapter 5 has been published in [30] and is the result of a three-month stay at Stony Brook University with Professor D. S. Warren. One of the differences among the various approaches to suspension-based tabled evaluation is the scheduling strategy. The two most popular strategies are local and batched evaluation. The former collects all the solutions to a tabled predicate before making any one of them available outside the tabled computation. The latter returns answers one by one before computing them all, which in principle is better if only one answer (or a subset of the answers) is desired. Batched evaluation is closer to SLD evaluation in that it computes solutions lazily as they are demanded, but it may need arbitrarily more memory than local evaluation, which is able to reclaim memory sooner. Some programs which in practice can be executed under the local strategy quickly run out of memory under batched evaluation. This has led to the general adoption of local evaluation at the expense of the more depth-first batched strategy. This chapter studies the reasons for the high memory consumption of batched evaluation and proposes a new scheduling strategy which we have termed swapping evaluation.
Swapping evaluation also returns answers one by one before completing a tabled call, but its memory usage can be orders of magnitude smaller than that of batched evaluation.

1.4.5. Pruning Operators under Tabled LP

Chapter 6 has been accepted for publication in PADL’13. It discusses the issues that arise when supporting pruning operators under tabled LP. A novel version of the once/1 pruning operator for tabled LP is introduced. The once/1 operator, together with answer-on-demand strategies, makes it possible to avoid computing unneeded solutions for a certain type of problem which can benefit from tabled LP but in which only a single solution is needed. Model checking and planning are examples of that broad class of applications, in which all-solution evaluation strategies, such as local evaluation, perform unnecessary work. The proposed version of once/1 is also directly applicable to the efficient implementation of other optimizations, such as early completion, cut-fail loops (to, e.g., prune at the top-level), if-then-else statements and constraint-based branch-and-bound. Although once/1 still has open issues, such as dependencies of tabled solutions on program history, our experimental evaluation confirms that the combination of swapping evaluation and once/1 provides an arbitrarily large efficiency improvement in several application areas.

1.4.6. Swapping Operation for Executing Trapped Computations

Chapter 7 has been published in [32]. It considers the problem of supporting goal-level, independent and-parallelism (IAP) in the presence of non-determinism. IAP is exploited when two or more goals which will not interfere at run time are scheduled for simultaneous execution. Backtracking over non-deterministic parallel goals runs into the well-known trapped goal and garbage slot problems. The solutions proposed for these problems generally require complex low-level machinery which makes systems difficult to maintain and extend, and in some cases can even affect sequential execution performance.
This chapter proposes a novel solution to the problem of trapped non-deterministic goals and garbage slots which is based on the tabling swapping operation, offering several advantages over previous proposals. While the implementation of this operation itself is not simple, in return it does not impose constraints on the scheduler. As a result, the scheduler and the rest of the run-time machinery can safely ignore the trapped

goal and garbage slot problems and their implementation is greatly simplified. Also, standard sequential execution remains unaffected. In addition to describing the solution we report on an implementation and provide performance results.

1.4.7. Memoization of Parallel Computations

Chapter 8 has been published in [31]. The most successful IAP implementations to date have used recomputation of answers and sequentially ordered backtracking. While in principle simplifying the implementation, recomputation can be very inefficient if the granularity of the parallel goals is large enough and they produce several answers, while sequentially ordered backtracking limits parallelism. And, despite the expected simplification, the implementation of the classic schemes has proved to involve complex engineering, with the consequent difficulty for system maintenance and extension. This chapter presents an alternative parallel backtracking model for IAP and its implementation. The model features parallel out-of-order (i.e., non-chronological) backtracking and relies on answer memoization to reuse and combine answers. We show that this approach can bring significant performance advantages and, in our experience, it is not harder to implement than existing approaches.


2 Tabling Background

Summary

The first part of this chapter introduces the general ideas of tabling, a resolution strategy which overcomes some of the limitations of SLD resolution. Tabling is presented by means of an evaluation example and some applications are given. The second part of this chapter explains tabling in more detail. In particular, we provide a detailed explanation of the table space and give a high-level program transformation for the execution of tabled programs in order to facilitate the understanding of the following chapters.

2.1. Introduction

A proposal that overcomes some of the limitations of SLD resolution, and therefore improves the declarativeness and expressiveness of Prolog, is the use of tabling [127, 24]. In a nutshell, tabling consists of storing intermediate answers for subgoals declared as tabled so that they can be reused when a repeated call to a tabled subgoal appears during the resolution process. Note that non-tabled subgoals are executed as usual. Resolution strategies based on tabling are able to reduce the search space, avoid looping, and have better termination properties than traditional Prolog models based on SLD resolution.

The basic idea behind tabled evaluation is straightforward: whenever a tabled subgoal is first called, a new entry is allocated in an appropriate data space called

the table space. Table entries are used to verify whether calls to subgoals are repeated and to collect the answers found for their corresponding subgoals. Repeated calls to tabled subgoals (a subgoal repeats a previous subgoal if the two are the same up to variable renaming) are not re-evaluated against the program clauses; instead, they are resolved by consuming the answers already stored in their table entries. During this process, as new answers are found, they are stored in their tables and later returned to all repeated calls. Within this model, the nodes in the search space are classified as either: generator nodes, corresponding to first calls to tabled subgoals; consumer nodes, corresponding to repeated calls to tabled subgoals; or interior nodes, corresponding to non-tabled subgoals.

2.2. Tabled Evaluation by an Example

Figure 2.1 shows a program similar to the one in Figure 1.2 to illustrate the main principles of tabled evaluation. At the top, the figure shows the program code (the left box) and the final state of the table space (the right box). The declaration :- table path/2 in the program code indicates that calls to predicate path/2 should be tabled. The sub-figure below shows the tabled evaluation for the query goal path(1,B). This evaluation is split into three different execution trees. Generator nodes are depicted by black oval boxes, and consumer nodes by white oval boxes. Remember that SLD resolution enters an infinite loop because the first clause of path/2 leads to a repeated call to path(1,B). In contrast, as we will see, termination is ensured with tabled evaluation.

The evaluation starts by adding a new entry to the table space and by allocating a generator node to represent path(1,B). Next, path(1,B) is resolved against the first clause for path/2, calling edge(1,C) (step 2). The edge/2 predicate is then resolved as usual because it is not tabled. The first clause for edge(1,C) succeeds with C = 2, and in the continuation path(2,B) is called (step 3). As this is the first call to path(2,B), a new entry is added to the table, and execution proceeds by allocating a new generator node, as shown in the bottommost tree. Again, path(2,B) is resolved against the first clause for path/2, calling edge(2,C) (step 4). The first clause for

edge(2,C) fails (step 5), but the second succeeds, creating a consumer node for path(1,B) (step 6). Since path(1,B) is a repeated call to the initial subgoal, no new tree is created and, instead, execution tries to consume answers from the table. At this point, the table does not have answers for path(1,B), and thus the current evaluation is suspended. Consumers must suspend because new answers may still be found for the corresponding call.

The only possible action after suspending is to backtrack to generator node 3. Execution tries the second clause for path(2,B), thus calling edge(2,B) (step 7). The first clause for edge(2,B) fails (step 8), but the second one succeeds, obtaining a first answer for path(2,B), which is inserted into the table space (step 9). Execution follows a Prolog-like strategy and continues forward execution, returning to the context of the path(1,B) generator. The binding B = 1 is propagated, obtaining a first answer for path(1,B) (step 10) and a first solution for the query goal (step 11).

Execution returns to the context of generator node 3, which has no more clauses left to try. Thus, its completion fixpoint computation is executed. The completion fixpoint computation resumes the consumers in the execution tree of a generator node in order to consume available answers until no more answers are generated (to this end, the subgoal frame of a generator keeps track of the consumers that appear in its execution subtree). Consumer node 6 is then resumed with the now available answer B = 1. The execution succeeds with a new answer for path(2,B) (step 12). However, this answer repeats the one which was found in step 9. Tabled resolution does not store duplicate answers in the table. Instead, repeated answers fail. This is how unnecessary computations, and in some cases even looping, are avoided. At this point, all available answers have been consumed by consumer node 6 and, thus, the consumer is suspended again.

Execution thus backtracks to generator node 3. Its completion fixpoint computation is finished and then it is checked whether path(2,B) can be completed (a node is completed when no more answers can be found for it; this is a non-trivial operation which is discussed later). It cannot, because it depends, through consumer node 6, on the earlier generator path(1,B). Completing path(2,B) earlier is not safe

because, at this point, new answers can still be found for subgoal path(1,B).

Figure 2.1: A successful tabled evaluation.

Program:

    :- table path/2.
    path(A,B) :- edge(A,C), path(C,B).
    path(A,B) :- edge(A,B).
    edge(1,2).
    edge(2,1).

Final state of the table space:

    Subgoal          Answers
    1. path(1,B)     10. B = 1    15. B = 2    20. Complete
    3. path(2,B)      9. B = 1    18. B = 2    20. Complete

Execution trees (numbers are evaluation steps):

    Query ?- path(1,B).:  11. B=1    16. B=2    21. no
    Tree for path(1,B):   1. path(1,B)    2. edge(1,C), path(C,B)    3. path(2,B)    10. B=1    19. fail (B=2)    13. fail    14. edge(1,B)    15. B=2    17. fail    20. complete
    Tree for path(2,B):   3. path(2,B)    4. edge(2,C), path(C,B)    5. fail    6. path(1,B)    12. fail (B=1)    18. B=2    7. edge(2,B)    8. fail    9. B=1    20. complete

If new answers are found, node 6 should be resumed with the newly found answers,

which in turn can lead to further answers for subgoal path(2,B). If the generator is completed sooner, execution can lose such answers. Execution thus backtracks to node 2, fails in step 13 and returns to node 1. Execution tries the second clause for path(1,B), thus calling edge(1,B) (step 14). The first clause for edge(1,B) succeeds with B = 2, obtaining a new answer for path(1,B) (step 15) and a new solution for the query goal (step 16). In the continuation, the second clause for edge(1,B) fails (step 17) and backtracking sends us back to node 1.

Node 1 has no more clauses left to try, so its completion fixpoint computation is performed. Consumer node 6 can be resumed as it has new unconsumed answers, and the new answer B = 2 is thus forwarded to it. This gives new answers to path(2,B) (step 18) and to path(1,B) (step 19). However, this last answer repeats the one which was found in step 15, so execution fails and backtracks again to node 1. The completion fixpoint computation of the generator path(1,B) is now finished. As this subgoal does not depend on any other subgoal, we can be sure that no more answers are forthcoming. Therefore, the two generators are completed (step 20) and execution returns no to the query goal (step 21).

One of the major characteristics of this execution model is that it can ensure termination for a wider class of programs. In particular, tabling ensures termination for programs with the bounded term size property: those programs where the sizes of subgoals and answers produced during an evaluation are less than some fixed number. This makes it much easier to reason about termination than in basic Prolog and can be useful when dealing with applications with recursive predicates, such as the path/2 predicate, that can lead to infinite loops. Moreover, as tabling-based models are able to avoid re-computation of tabled subgoals, they can reduce the search space and the complexity of a program.
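The evaluation traced above can be reproduced on a tabling-enabled system with the following minimal sketch. It assumes a Prolog system that accepts the :- table directive directly, as XSB, YAP and SWI-Prolog do; other systems may require a tabling library or package to be loaded first. The helper reachable/2 is a hypothetical name, introduced only to collect the answers in a fixed order, because the order in which answers are returned depends on the scheduling strategy of the system.

```prolog
% Program of Figure 2.1. Assumption: the system supports ':- table'
% without further declarations.
:- table path/2.

path(A, B) :- edge(A, C), path(C, B).
path(A, B) :- edge(A, B).

edge(1, 2).
edge(2, 1).

% reachable(+From, -Nodes): sorted list of all answers to path(From, B).
% Hypothetical helper; sorting hides the system-dependent answer order.
reachable(From, Nodes) :-
    findall(B, path(From, B), Bs),
    sort(Bs, Nodes).
```

With this program, ?- reachable(1, Ns). terminates with Ns = [1, 2], the two answers recorded for path(1,B) in the table of Figure 2.1, whereas the same query without the table declaration loops as in Figure 1.2.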
The ability to avoid re-computation can be exploited as a means to speed up execution. Consider again the Fibonacci program defined in Figure 1.1. If predicate fib/2 is declared as tabled, each different subgoal call is only computed once, since for repeated calls the corresponding answer is already stored in the table space. To compute fib(n,R) for some integer n, SLD resolution will search a tree whose size is exponential in n. Because tabling remembers sub-computations, the number of resolution steps

for this example is linear in n.

2.3. Scheduling Strategies

It should be clear that at several points we can choose between continuing forward execution, backtracking to interior nodes, returning answers to consumer nodes, or performing completion. The decision on which operation to perform is crucial to system performance and is determined by the scheduling strategy. Different strategies may have a significant impact on performance, and may lead to a different ordering of solutions to the query goal. Arguably, the two most successful tabling scheduling strategies are batched scheduling and local scheduling [52].

Batched scheduling is the strategy we followed in Figure 2.1: it favors forward execution first, backtracking next, and consuming answers or completion last. As a result, it schedules the program clauses in a depth-first manner, as does the WAM. It thus tries to delay the need to move around the search tree by batching the return of answers. When new answers are found for a particular tabled subgoal, they are added to the table space and the evaluation continues. In some situations, this results in creating dependencies on older subgoals, therefore delaying the completion point to an older generator node. When backtracking, we may find the following situations: (i) if backtracking to a generator or interior node, we try the next clause; (ii) if a generator has no more clauses left to try, we execute its completion fixpoint computation; (iii) if backtracking to a consumer node, we try the next unconsumed answer; (iv) if a consumer has no more unconsumed answers, we simply backtrack to the previous node on the current branch.

Local scheduling is an alternative tabling scheduling strategy that tries to evaluate subgoals as independently as possible. In this strategy, evaluation is done one generator subtree at a time. The key idea is that whenever new answers are found, they are added to the table space as usual but execution fails.
Thus, execution explores the whole generator subtree before propagating answers to the generator continuation. Coming back to the previous tabled evaluation, we would fail at step 9, prioritizing the completion of generator node 3 over answer propagation. Hence, answers are only returned when the completion fixpoint

computation of a generator is finished. Since local scheduling completes subgoals sooner, we can expect fewer dependencies between subgoals.

2.4. Tabling Applications

The previous sections have shown that tabling significantly expands the types of programming that can be done in Prolog by allowing recursion to be coded in a more declarative way. Tabling improves the termination properties of Prolog, and its better declarativeness allows easier program analysis over tabled programs than over the equivalent non-tabled ones. As a result, robust implementations of tabling have led to a profusion of research and commercial applications, including program verification [102, 49, 89, 101, 74, 94, 98, 117], program analysis [43, 12, 33, 72, 115], natural language analysis and data standardization [78, 102, 111, 38, 41], agent implementations [4, 81, 73, 79, 80, 125], semantic web [97, 42, 128, 124, 123, 139, 11, 48], diagnosis [22, 3, 9], medical informatics [54, 91], machine learning [76, 93] and software engineering [95, 120, 92, 103]. Many other commercial applications have been developed by XSB, Inc., Medical Decision Logics, Inc (www.mdlogix.com), Ontology Works (www.ontologyworks.com) and other companies. All of these applications demonstrate that tabled LP is a vibrant field of research, involving numerous Prolog systems, including Ciao.

2.5. The Table Space for Tabling

This section describes a key module for tabling implementation: the one which implements the table space. This module is very important, since the correct design of the algorithms to access and manipulate the table data is critical to achieving an efficient implementation. We explain a solution for the table space based on tries, as proposed by [105].

2.5.1. The Trie Data Structure

Tries were first proposed by [51], the name coming from the central letters of the word retrieval. Tries were originally invented to index dictionaries, and have been generalized to index recursive data structures such as terms. Please refer to [105, 8] for the use of tries in automated theorem proving, term rewriting and tabled logic programs. An essential property of the trie structure is that common prefixes are represented only once. The memory efficiency of a particular trie depends on the proportion of terms that share common prefixes. For (tabled) logic programs, which recursively construct answers, we can often take advantage of common prefixes.

A trie is a tree structure where each different path through the trie data units, the trie nodes, corresponds to a term. At the entry point we have the root node. Internal nodes represent symbols in terms and leaf nodes specify the end of terms. Each root-to-leaf path represents a term described by the symbols labeling the nodes traversed. Two terms with common prefixes will branch off from each other at the first distinguishing symbol. When inserting a new term, the trie is traversed starting at the root node. Each child node specifies the next symbol to be inspected in the input term. A transition is taken if the symbol in the input term at a given position matches a symbol on a child node. Otherwise, a new child node representing the current symbol is added and an outgoing transition from the current node is made to point to the new child node. On reaching the last symbol in the input term, we reach a leaf node in the trie.

Figure 2.2 presents an example of the insertion of three different terms in a trie structure. Initially, the trie contains the root node only. Next, f(X,a) is inserted.
As a result, three nodes are created: one for the functor f/2, one for the variable X (which is renamed as explained below), and one for the constant a (Figure 2.2(a)). The second step is to insert g(X,b,Y). Since the two terms differ in their main functor, tries bring no benefit here (Figure 2.2(b)). In the last step, f(Y,1) is inserted and the two nodes shared with the term f(X,a) are reused (Figure 2.2(c)).

Figure 2.2: Using tries to represent terms.

    (a) Set of terms: f(X,a).
        Trie: root node -> f/2 -> VAR0 -> a
    (b) Set of terms: f(X,a), g(X,b,Y).
        Trie: root node -> f/2 -> VAR0 -> a
              root node -> g/3 -> VAR0 -> b -> VAR1
    (c) Set of terms: f(X,a), g(X,b,Y), f(Y,1).
        Trie: as in (b), with a second child, 1, under f/2 -> VAR0

An important point when using tries to represent Prolog terms is the treatment of variables. We follow the formalism proposed by [8], where each variable in a term is represented as a distinct constant. Formally, this corresponds to a function, numbervar(), from the set of variables in a term t to the sequence of constants VAR0, ..., VARN, such that numbervar(X) < numbervar(Y) if X is encountered before Y in the left-to-right traversal of t. For example, in the term g(X,b,Y), numbervar(X) and numbervar(Y) are respectively VAR0 and VAR1. On the other hand, in terms f(X,a) and f(Y,1), numbervar(X) and numbervar(Y) are both VAR0. This is why the child node VAR0 of f/2 in Figure 2.2(c) is common to both terms.

2.5.2. Using Tries to Organize the Table Space

We next describe how tries are used to implement the table space. Figure 2.3 shows an example for a tabled predicate f/2 after the execution of the following tabling operations:

    tabled subgoal call: f(X,a)
    tabled subgoal call: f(Y,1)
    tabled new answer: f(0,a)
    tabled new answer: f(a,1)
    tabled new answer: f(b,1)

[Figure 2.3: Using tries to organize the table space. The subgoal trie rooted at SG_TRIE holds the paths f/2, VAR0, a and f/2, VAR0, 1; their leaves are the subgoal frames for the calls f(VAR0,a) and f(VAR0,1), each of which points to its own answer trie.]

We use two levels of tries: one stores the subgoal calls and the other stores the answers to a particular subgoal call. Each different call to a tabled predicate corresponds to a unique path through the subgoal trie. Such a path always starts from the root node of this trie (the SG_TRIE variable), follows a sequence of subgoal trie nodes, and terminates at a leaf data structure, the subgoal frame. Each subgoal frame stores information about the subgoal, including an entry point to its answer table. Each unique path through the answer trie nodes corresponds to a different answer to the corresponding subgoal.
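The two-level organization can be illustrated with a small Python sketch that replays the five tabling operations listed for Figure 2.3. It is a simplified model under stated assumptions: tries are plain nested dictionaries, terms are tuples (functor, arg1, ..., argN), and the names TableSpace, SubgoalFrame, path_of, trie_insert, FRAME and END are hypothetical helpers introduced only for this example.

```python
FRAME = object()   # reserved key: subgoal frame stored at a subgoal-trie leaf
END = object()     # reserved key: marks the end of a stored answer


class Var:
    """A logic variable, compared by identity."""


def path_of(term):
    """Symbols along the trie path for `term`, with variables numbered."""
    numbering, out = {}, []

    def walk(t):
        if isinstance(t, Var):
            out.append(f"VAR{numbering.setdefault(t, len(numbering))}")
        elif isinstance(t, tuple):            # (functor, arg1, ..., argN)
            out.append(f"{t[0]}/{len(t) - 1}")
            for arg in t[1:]:
                walk(arg)
        else:
            out.append(t)

    walk(term)
    return tuple(out)


def trie_insert(trie, path):
    """Follow `path` in a nested-dict trie, creating nodes as needed."""
    node = trie
    for symbol in path:
        node = node.setdefault(symbol, {})
    return node


class SubgoalFrame:
    def __init__(self, call_path):
        self.call_path = call_path
        self.answer_trie = {}   # second level: answers to this call
        self.answers = []       # answers in insertion order


class TableSpace:
    def __init__(self):
        self.sg_trie = {}       # first level: subgoal calls (SG_TRIE)

    def subgoal_call(self, call):
        """Find or insert a call; variants share the same subgoal frame."""
        path = path_of(call)
        leaf = trie_insert(self.sg_trie, path)
        if FRAME not in leaf:
            leaf[FRAME] = SubgoalFrame(path)
        return leaf[FRAME]

    def new_answer(self, frame, answer):
        """Insert an answer; return False if it was already in the table."""
        leaf = trie_insert(frame.answer_trie, path_of(answer))
        if END in leaf:
            return False
        leaf[END] = True
        frame.answers.append(answer)
        return True


X, Y = Var(), Var()
table = TableSpace()
fa = table.subgoal_call(("f", X, "a"))
f1 = table.subgoal_call(("f", Y, 1))
table.new_answer(fa, ("f", 0, "a"))
table.new_answer(f1, ("f", "a", 1))
table.new_answer(f1, ("f", "b", 1))
```

Calling table.subgoal_call again with a renamed variant such as f(Z,a) returns the existing frame fa, because both calls map to the path f/2, VAR0, a.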

2.6. Tabling Explained as a Source to Source Transformation.

The purpose of this section is purely pedagogical: it aims to clarify the basic actions of tabling by presenting a simple source to source transformation that affects only tabled predicates. Although such a transformation hides certain aspects of a tabling implementation, it illustrates clearly that tabling can be implemented by adding a set of (admittedly quite complex) built-in predicates to an existing Prolog implementation. It also gives a good high-level description of the actions to be taken by a tabling system. It is then instructive to link these actions to those of the different tabling implementation approaches presented later in Chapter 3.

An example is used to show the source to source transformation; its generalization is straightforward. Let t/1 be a tabled predicate defined as follows:

    :- table t/1.
    t(X) :- body1(X).
    t(X) :- body2(X).

t/1 is source-transformed to the following:

    t(X) :-
        lookup_call(t(X), SF),
        (
            SF.status == new ->
            tabled_t(X, SF)
        ;
            SF.status == complete ->
            consume_answers(t(X), SF)
        ;
            create_consumer(t(X), SF, Cons),
            protect_consumer(Cons),
            fail
        ).

    tabled_t(X, SF) :- body1(X), new_answer(t(X), SF).
    tabled_t(X, SF) :- body2(X), new_answer(t(X), SF).
    tabled_t(_, SF) :- complete(SF).

The functionality of the newly introduced built-ins is as follows:

lookup_call/2 always succeeds. It finds (or inserts) the tabled call t(X) in the table space, returning a handle to its subgoal frame, SF, and in doing so it also determines its status: whether the subgoal is new to the evaluation or not, and whether it is already complete or still being evaluated. Note that we consider here variant tabling, where two subgoal calls (or answers) are considered the same if they are identical up to variable renaming.

If the subgoal was just inserted in the table (i.e., it is new), the current subgoal is a generator and execution resolves against program clauses (nodes 1 and 3 in Figure 2.1). The program clauses are executed by calling the tabled_t/2 predicate, which takes SF as an extra argument for tabling control reasons.

If the subgoal is not new to the evaluation, the current subgoal is a consumer and no clauses of the original predicate t/1 are executed. If the corresponding table entry is complete, consumption of answers from the table space can be initiated by the built-in consume_answers/2, which accesses the answer table (where solutions to tabled predicates are stored) through its second argument. Otherwise, the corresponding generator is still being evaluated and a consumer, pointed to by Cons, is created by create_consumer/3 (node 6 in Figure 2.1). Its memory is protected from backtracking by protect_consumer/1 to allow a later resumption (implementation details of protect_consumer/1 are given in the next section). create_consumer/3 also updates the dependency information between generators, which will be used to detect completion.

Notice that in the above transformation there is a fail after protect_consumer/1. This means that alternative branches of the computation are explored before the consumer gets to consume any answer.
An alternative is to let the consumer first consume the currently available answers. However, the choice between such alternatives belongs to the scheduling strategy, which is an orthogonal issue. Note that the split of actions between create_consumer/3 and protect_consumer/1 is made purely for explanatory reasons.

A call to new_answer/2 is performed at the end of each clause of the original predicate. If the answer computed by this clause was derived before and already appears in the table space, new_answer/2 fails (steps 12 and 19 in Figure 2.1); if the answer is new, new_answer/2 inserts it in the answer table of the particular tabled call and computation then proceeds normally (steps 9, 10, 15 and 18 in Figure 2.1). Remember that the second argument of new_answer/2, SF, contains a pointer to the answer table.

Predicate complete/1 is called after all program resolution against the clauses of a generator is finished. complete/1 performs the completion fixpoint computation of a generator, scheduling consumers to consume their answers until no more answers are available. If the completion fixpoint computation is finished, execution backtracks to the previous node. Otherwise, a consumer that has not consumed all its answers is resumed. Later on, when the consumer has read all its available answers and is suspended again, the completion fixpoint computation continues. Consumers read answers in the same order in which they are inserted in the table.

A generator is said to be complete when its set of stored answers represents all the conclusions that can be inferred from the set of facts and rules in the program for the tabled call associated with the table space entry. Otherwise, it is said to be incomplete. A generator is thus marked as complete when, after the execution of the completion fixpoint procedure, it is determined that the generator does not depend on previous generators which have not been completed yet (step 20 in Figure 2.1). This dependency appears when there is a consumer of a previous (non-completed) generator in the execution subtree of the generator at hand (after step 12 in Figure 2.1).
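The interplay of new_answer/2 and complete/1 can be approximated by the following Python sketch. It is deliberately not the suspension-based mechanism described above: instead of suspending and resuming consumers, it simply re-runs the clauses of every incomplete generator until a whole pass adds no new answer (naive iteration to a fixpoint), and it marks all subgoals reached by the top-level query as complete together, which is coarser than completing per leader. The program (a right-recursive path/2), the graph in EDGES and the names PathTabler and SubgoalFrame are assumptions made for illustration only.

```python
class SubgoalFrame:
    def __init__(self):
        self.status = "new"   # "new" -> "evaluating" -> "complete"
        self.answers = []     # kept in insertion order


class PathTabler:
    """Tabled evaluation of:  path(X,Y) :- edge(X,Y).
                              path(X,Y) :- edge(X,Z), path(Z,Y)."""

    def __init__(self, edges):
        self.edges = edges
        self.table = {}       # X -> frame for the call path(X, B)
        self.running = set()  # generators whose clauses are executing now
        self.changed = False  # did the current pass add any answer?

    def new_answer(self, frame, answer):
        """Fail (return False) on a repeated answer, otherwise store it."""
        if answer in frame.answers:
            return False
        frame.answers.append(answer)
        self.changed = True
        return True

    def call(self, x):
        """Return the answers currently known for path(x, B)."""
        frame = self.table.setdefault(x, SubgoalFrame())
        if frame.status != "complete" and x not in self.running:
            # Generator: resolve against the program clauses.
            frame.status = "evaluating"
            self.running.add(x)
            for z in self.edges[x]:            # first clause
                self.new_answer(frame, z)
            for z in self.edges[x]:            # second clause
                for y in self.call(z):
                    self.new_answer(frame, y)
            self.running.discard(x)
        # Otherwise the call acts as a consumer: it only reads the table.
        return list(frame.answers)

    def query(self, x):
        """Iterate until no pass adds answers, then complete all subgoals."""
        while True:
            self.changed = False
            self.call(x)
            if not self.changed:
                break
        for frame in self.table.values():
            if frame.status == "evaluating":
                frame.status = "complete"
        return list(self.table[x].answers)


EDGES = {1: [2], 2: [1, 3], 3: []}   # hypothetical graph with a cycle 1 <-> 2
tabler = PathTabler(EDGES)
result = tabler.query(1)
print(result)   # [2, 1, 3]
```

In this run path(1,B) and path(2,B) depend on each other, so neither table can be closed before the other; the sketch only marks them complete once a full pass over both has produced nothing new.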
Completion is non-trivial because a number of subgoals may be mutually dependent, thus forming a Strongly Connected Component (or SCC) [129]. Clearly, we can only complete the subgoals in an SCC together. An SCC is usually represented through its leader node, which is the youngest generator node that does not depend on older generators. For example, in Figure 2.1, the leader node for the SCC that includes the subgoals path(1,B) and path(2,B) is node 1. A leader node is also the oldest generator node for its SCC, and defines the next completion point.

2.7. Protecting Consumer Memory from Backtracking.

In order to explain the implementation details of protect_consumer/1 we assume some familiarity with the usual implementation model of Prolog, the Warren Abstract Machine (WAM) [2], although we give a short introduction here.

2.7.1. Warren Abstract Machine.

The WAM is a stack-based architecture with simple data structures and a low-level instruction set. We assume a four-stack WAM, i.e., an implementation with separate stacks for the choicepoints, the environments, the heap and the trail, although this is by no means essential to this thesis. For stack representation, it is assumed that stacks grow downwards; i.e., higher in the stack means older, and lower in the stack means younger or more recent. An explanation of the different WAM stacks follows:

• Heap stack: stores dynamic terms created at execution time. The topmost heap cell is pointed to by the H register. New terms are created on top of H, and H is then updated.

• Trail stack: saves information about variable bindings in order to undo these bindings on backtracking before executing an alternative execution path. The topmost trail cell is pointed to by the TR register.

• Environment stack or local stack: keeps information about the environments of the predicate calls. Environments (or frames) are pushed onto the local stack when a clause whose body contains more than one subgoal is called. They are popped off when the last subgoal of the body is executed. The current frame is pointed to by the E register and the topmost one by the EB register. Frames keep information about the previous active frame (to be reinstalled when a frame is popped off), the program counter at which execution resumes when a frame is reactivated, and pointers to the local variables of the clause execution.

• Choicepoint stack: keeps information about alternative execution paths. Choicepoints are pushed onto the choicepoint stack when a predicate with different alternatives is executed. A choicepoint stores the information needed to restore the execution state as it was when the choicepoint was created, together with a pointer to the next alternative to be executed. Choicepoints are popped off when the last alternative of the predicate is executed. The topmost choicepoint is pointed to by the B register.

For the purposes of this thesis, we need a more detailed introduction to the unification and backtracking operations. When a variable is unified, a new trail cell is pushed onto the trail stack which points to the memory cell of the variable. Variables can live in either the heap or the local stack, and unbound variables are represented as a pointer to themselves. The trailing of unified variables is needed in order to allow the execution of the backtracking operation.

The backtracking operation uses the information stored in the choicepoint pointed to by B. A choicepoint stores the values of H, EB and TR at the time of the choicepoint creation. We will use, respectively, the following notation for these fields: B[H], B[EB] and B[TR]. In order to discard the failing alternative, the backtracking operation has to undo all the bindings made by the failing alternative. This is achieved by traversing the trail stack from TR back to B[TR] and resetting to a self-reference each memory cell pointed to by these trail cells; these cells represent the variables which were unified by the failing alternative. Also, the heap cells, the local frames and the trail cells which were created after the creation of the choicepoint pointed to by B can be discarded, as they belong to the failing alternative.
This is achieved by setting:

    H = B[H]
    EB = B[EB]
    TR = B[TR]
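The trailing and untrailing steps described above can be made concrete with a toy Python model. It is only a sketch under simplifying assumptions: stacks are Python lists that grow at the end (rather than downwards), heap cells are tagged pairs, every binding is trailed, and the environment stack is reduced to a bare EB counter. The names Machine, new_var, bind, push_choicepoint and backtrack are hypothetical.

```python
class Machine:
    """Toy model of the heap, the trail and the choicepoint stack."""

    def __init__(self):
        self.heap = []           # cells: ("REF", addr) or ("CON", value)
        self.trail = []          # addresses of cells bound so far
        self.choicepoints = []   # each entry saves H, TR and EB
        self.EB = 0              # stand-in for the environment stack top

    @property
    def H(self):
        return len(self.heap)

    @property
    def TR(self):
        return len(self.trail)

    def new_var(self):
        """Create an unbound variable: a cell that points to itself."""
        addr = self.H
        self.heap.append(("REF", addr))
        return addr

    def bind(self, addr, value):
        """Bind a variable and record its address on the trail."""
        self.trail.append(addr)
        self.heap[addr] = ("CON", value)

    def push_choicepoint(self):
        self.choicepoints.append({"H": self.H, "TR": self.TR, "EB": self.EB})

    def backtrack(self):
        """Undo the failing alternative using the topmost choicepoint B."""
        b = self.choicepoints[-1]
        # Untrail: walk from TR back to B[TR], restoring self-references.
        while self.TR > b["TR"]:
            addr = self.trail.pop()
            self.heap[addr] = ("REF", addr)
        # Discard heap cells created after the choicepoint: H = B[H].
        del self.heap[b["H"]:]
        # Restore the environment stack top: EB = B[EB].
        self.EB = b["EB"]


m = Machine()
x = m.new_var()          # created before the choicepoint
m.push_choicepoint()
y = m.new_var()          # created by the alternative that will fail
m.bind(x, "a")
m.bind(y, "b")
m.backtrack()
print(m.heap, m.trail)   # [('REF', 0)] []
```

After backtracking, x is unbound again and the cell for y has disappeared, since it was allocated above B[H].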