Goto

From The Right Wiki
Jump to navigationJump to search
File:GOTOkey(ZXSpectrum).jpg
"GOTO" key on the 1982 ZX Spectrum home computer, implemented with native BASIC (one-key command entry).

Goto is a statement found in many computer programming languages. It performs a one-way transfer of control to another line of code; in contrast a function call normally returns control. The jumped-to locations are usually identified using labels, though some languages use line numbers. At the machine code level, a goto is a form of branch or jump statement, in some cases combined with a stack adjustment. Many languages support the goto statement, and many do not (see § language support). The structured program theorem proved that the goto statement is not necessary to write programs that can be expressed as flow charts; some combination of the three programming constructs of sequence, selection/choice, and repetition/iteration are sufficient for any computation that can be performed by a Turing machine, with the caveat that code duplication and additional variables may need to be introduced.[1] The use of goto was formerly common, but since the advent of structured programming in the 1960s and 1970s, its use has declined significantly. It remains in use in certain common usage patterns, but alternatives are generally used if available. In the past, there was considerable debate in academia and industry on the merits of the use of goto statements. The primary criticism is that code that uses goto statements is harder to understand than alternative constructions. Debates over its (more limited) uses continue in academia and software industry circles.

Usage

goto label The goto statement is often combined with the if statement to cause a conditional transfer of control. IF condition THEN goto label Programming languages impose different restrictions with respect to the destination of a goto statement. For example, the C programming language does not permit a jump to a label contained within another function,[2] however jumps within a single call chain are possible using the setjmp/longjmp functions.

Criticism

At the pre-ALGOL meeting held in 1959, Heinz Zemanek explicitly cast doubt on the necessity of GOTO statements; at the time no one[citation needed] paid attention to his remark, including Edsger W. Dijkstra, who later became the iconic opponent of GOTO.[3] The 1970s and 1980s saw a decline in the use of GOTO statements in favor of the structured programming paradigm, with GOTO criticized as leading to unmaintainable spaghetti code. Some programming style coding standards, for example the GNU Pascal Coding Standards, recommend against the use of GOTO statements.[4] The Böhm–Jacopini proof (1966) did not settle the question of whether to adopt structured programming for software development, partly because the construction was more likely to obscure a program than to improve it because its application requires the introduction of additional local variables.[5] It did, however, spark a prominent debate among computer scientists, educators, language designers and application programmers that saw a slow but steady shift away from the formerly ubiquitous use of the GOTO. Probably the most famous criticism of GOTO is a 1968 letter by Edsger Dijkstra called "Go-to statement considered harmful".[3] In that letter, Dijkstra argued that unrestricted GOTO statements should be abolished from higher-level languages because they complicated the task of analyzing and verifying the correctness of programs (particularly those involving loops).[6] The letter itself sparked a debate, including a "'GOTO Considered Harmful' Considered Harmful" letter[7] sent to Communications of the ACM (CACM) in March 1987, as well as further replies by other people, including Dijkstra's On a Somewhat Disappointing Correspondence.[8] An alternative viewpoint is presented in Donald Knuth's Structured Programming with go to Statements, which analyzes many common programming tasks and finds that in some of them GOTO is the optimal language construct to use.[9] In The C Programming Language, Brian Kernighan and Dennis Ritchie warn that goto is "infinitely abusable", but also suggest that it could be used for end-of-function error handlers and for multi-level breaks from loops.[10] These two patterns can be found in numerous subsequent books on C by other authors;[11][12] [13][14] a 2007 introductory textbook notes that the error handling pattern is a way to work around the "lack of built-in exception handling within the C language".[11] Other programmers, including Linux kernel designer and coder Linus Torvalds or software engineer and book author Steve McConnell, also object to Dijkstra's point of view, stating that GOTOs can be a useful language feature, improving program speed, size and code clarity, but only when used in a sensible way by a comparably sensible programmer.[15][16] According to computer science professor John Regehr, in 2013, there were about 100,000 instances of goto in the Linux kernel code.[17] Other academics took a more extreme viewpoint and argued that even instructions like break and return from the middle of loops are bad practice as they are not needed in the Böhm–Jacopini result, and thus advocated that loops should have a single exit point.[18] For instance, Bertrand Meyer wrote in his 2009 textbook that instructions like break and continue "are just the old goto in sheep's clothing".[19] A slightly modified form of the Böhm–Jacopini result, however, allows the avoidance of additional variables in structured programming, as long as multi-level breaks from loops are allowed.[20] Because some languages like C don't allow multi-level breaks via their break keyword, some textbooks advise the programmer to use goto in such circumstances.[14] The MISRA C 2004 standard bans goto, continue, as well as multiple return and break statements.[21] The 2012 edition of the MISRA C standard downgraded the prohibition on goto from "required" to "advisory" status; the 2012 edition has an additional, mandatory rule that prohibits only backward, but not forward jumps with goto.[22][23] FORTRAN introduced structured programming constructs in 1978, and in successive revisions the relatively loose semantic rules governing the allowable use of goto were tightened; the "extended range" in which a programmer could use a GOTO to leave and re-enter a still-executing DO loop was removed from the language in 1978,[24] and by 1995 several forms of Fortran GOTO, including the Computed GOTO and the Assigned GOTO, had been deleted.[25] Some widely used modern programming languages such as Java and Python lack the GOTO statement – see language support – though most provide some means of breaking out of a selection, or either breaking out of or moving on to the next step of an iteration. The viewpoint that disturbing the control flow in code is undesirable may be seen in the design of some programming languages, for instance Ada[26] visually emphasizes label definitions using angle brackets. Entry 17.10 in comp.lang.c FAQ list[27] addresses the issue of GOTO use directly, stating

Programming style, like writing style, is somewhat of an art and cannot be codified by inflexible rules, although discussions about style often seem to center exclusively around such rules. In the case of the goto statement, it has long been observed that unfettered use of goto's quickly leads to unmaintainable spaghetti code. However, a simple, unthinking ban on the goto statement does not necessarily lead immediately to beautiful programming: an unstructured programmer is just as capable of constructing a Byzantine tangle without using any goto's (perhaps substituting oddly-nested loops and Boolean control variables, instead). Many programmers adopt a moderate stance: goto's are usually to be avoided, but are acceptable in a few well-constrained situations, if necessary: as multi-level break statements, to coalesce common actions inside a switch statement, or to centralize cleanup tasks in a function with several error returns. (...) Blindly avoiding certain constructs or following rules without understanding them can lead to just as many problems as the rules were supposed to avert. Furthermore, many opinions on programming style are just that: opinions. They may be strongly argued and strongly felt, they may be backed up by solid-seeming evidence and arguments, but the opposing opinions may be just as strongly felt, supported, and argued. It's usually futile to get dragged into "style wars", because on certain issues, opponents can never seem to agree, or agree to disagree, or stop arguing.

Common usage patterns

While overall usage of goto has been declining, there are still situations in some languages where a goto provides the shortest and most straightforward way to express a program's logic (while it is possible to express the same logic without gotos, the equivalent code will be longer and often more difficult to understand). In other languages, there are structured alternatives, notably exceptions and tail calls. Situations in which goto is often useful include:

  • To make the code more readable and easier to follow[15][28]
  • To make smaller programs, and get rid of code duplication[15][28]
  • Implement a finite-state machine, using a state transition table and goto to switch between states (in absence of tail call elimination), particularly in automatically generated C code.[29] For example, goto in the canonical LR parser.
  • Implementing multi-level break and continue if not directly supported in the language; this is a common idiom in C.[14] Although Java reserves the goto keyword, it doesn't actually implement it. Instead, Java implements labelled break and labelled continue statements.[30] According to the Java documentation, the use of gotos for multi-level breaks was the most common (90%) use of gotos in C.[31] Java was not the first language to take this approach—forbidding goto, but providing multi-level breaks— the BLISS programming language (more precisely the BLISS-11 version thereof) preceded it in this respect.[32]
  • Surrogates for single-level break or continue (retry) statements when the potential introduction of additional loops could incorrectly affect the control flow. This practice has been observed in Netbsd code.[33]
  • Error handling (in absence of exceptions), particularly cleanup code such as resource deallocation.[11][14][33][29][34] C++ offers an alternative to goto statement for this use case, which is : Resource Acquisition Is Initialization (RAII) through using destructors or using try and catch exceptions used in Exception handling.[35] setjmp and longjmp are another alternative, and have the advantage of being able to unwind part of the call stack.
  • Popping the stack in, e.g., Algol, PL/I.
  • Specialized scripting languages that operate in a linear manner, such as a dialogue system for video games.[36]

These uses are relatively common in C, but much less common in C++ or other languages with higher-level features.[34] However, throwing and catching an exception inside a function can be extraordinarily inefficient in some languages; a prime example is Objective-C, where a goto is a much faster alternative.[37] Another use of goto statements is to modify poorly factored legacy code, where avoiding a goto would require extensive refactoring or code duplication. For example, given a large function where only certain code is of interest, a goto statement allows one to jump to or from only the relevant code, without otherwise modifying the function. This usage is considered code smell,[38] but finds occasional use.

Alternatives

Structured programming

The modern notion of subroutine was invented by David Wheeler when programming the EDSAC. To implement a call and return on a machine without a subroutine call instruction, he used a special pattern of self-modifying code, known as a Wheeler jump.[39] This resulted in the ability to structure programs using well-nested executions of routines drawn from a library. This would not have been possible using only goto, since the target code, being drawn from the library, would not know where to jump back to. Later, high-level languages such as Pascal were designed around support for structured programming, which generalized from subroutines (also known as procedures or functions) towards further control structures such as:

These new language mechanisms replaced equivalent flows which previously would have been written using gotos and ifs. Multi-way branching replaces the "computed goto" in which the instruction to jump to is determined dynamically (conditionally). Under certain conditions it is possible to eliminate local goto statements of legacy programs by replacing them with multilevel loop exit statements.[40]

Exceptions

In practice, a strict adherence to the basic three-structure template of structured programming yields highly nested code, due to inability to exit a structured unit prematurely, and a combinatorial explosion with quite complex program state data to handle all possible conditions. Two solutions have been generally adopted: a way to exit a structured unit prematurely, and more generally exceptions – in both cases these go up the structure, returning control to enclosing blocks or functions, but do not jump to arbitrary code locations. These are analogous to the use of a return statement in non-terminal position – not strictly structured, due to early exit, but a mild relaxation of the strictures of structured programming. In C, break and continue allow one to terminate a loop or continue to the next iteration, without requiring an extra while or if statement. In some languages multi-level breaks are also possible. For handling exceptional situations, specialized exception handling constructs were added, such as try/catch/finally in Java. The throw-catch exception handling mechanisms can also be easily abused to create non-transparent control structures, just like goto can be abused.[41]

Tail calls

In a paper delivered to the ACM conference in Seattle in 1977, Guy L. Steele summarized the debate over the GOTO and structured programming, and observed that procedure calls in the tail position of a procedure can be most optimally treated as a direct transfer of control to the called procedure, typically eliminating unnecessary stack manipulation operations.[42] Since such "tail calls" are very common in Lisp, a language where procedure calls are ubiquitous, this form of optimization considerably reduces the cost of a procedure call compared to the GOTO used in other languages. Steele argued that poorly implemented procedure calls had led to an artificial perception that the GOTO was cheap compared to the procedure call. Steele further argued that "in general procedure calls may be usefully thought of as GOTO statements which also pass parameters, and can be uniformly coded as machine code JUMP instructions", with the machine code stack manipulation instructions "considered an optimization (rather than vice versa!)".[42] Steele cited evidence that well optimized numerical algorithms in Lisp could execute faster than code produced by then-available commercial Fortran compilers because the cost of a procedure call in Lisp was much lower. In Scheme, a Lisp dialect developed by Steele with Gerald Jay Sussman, tail call optimization is mandatory.[43] Although Steele's paper did not introduce much that was new to computer science, at least as it was practised at MIT, it brought to light the scope for procedure call optimization, which made the modularity-promoting qualities of procedures into a more credible alternative to the then-common coding habits of large monolithic procedures with complex internal control structures and extensive state data. In particular, the tail call optimizations discussed by Steele turned the procedure into a credible way of implementing iteration through single tail recursion (tail recursion calling the same function). Further, tail call optimization allows mutual recursion of unbounded depth, assuming tail calls – this allows transfer of control, as in finite state machines, which otherwise is generally accomplished with goto statements.

Coroutines

Coroutines are a more radical relaxation of structured programming, allowing not only multiple exit points (as in returns in non-tail position), but also multiple entry points, similar to goto statements. Coroutines are more restricted than goto, as they can only resume a currently running coroutine at specified points – continuing after a yield – rather than jumping to an arbitrary point in the code. A limited form of coroutines are generators, which are sufficient for some purposes. Even more limited are closures – subroutines which maintain state (via static variables), but not execution position. A combination of state variables and structured control, notably an overall switch statement, can allow a subroutine to resume execution at an arbitrary point on subsequent calls, and is a structured alternative to goto statements in the absence of coroutines; this is a common idiom in C, for example.

Continuations

A continuation is similar to a GOTO in that it transfers control from an arbitrary point in the program to a previously marked point. A continuation is more flexible than GOTO in those languages that support it, because it can transfer control out of the current function, something that a GOTO cannot do in most structured programming languages. In those language implementations that maintain stack frames for storage of local variables and function arguments, executing a continuation involves adjusting the program's call stack in addition to a jump. The longjmp function of the C programming language is an example of an escape continuation that may be used to escape the current context to a surrounding one. The Common Lisp GO operator also has this stack unwinding property, despite the construct being lexically scoped, as the label to be jumped to can be referenced from a closure. In Scheme, continuations can even move control from an outer context to an inner one if desired. This almost limitless control over what code is executed next makes complex control structures such as coroutines and cooperative multitasking relatively easy to write.[43]

Message passing

In non-procedural paradigms, goto is less relevant or completely absent. One of the main alternatives is message passing, which is of particular importance in concurrent computing, interprocess communication, and object oriented programming. In these cases, the individual components do not have arbitrary transfer of control, but the overall control may be scheduled in complex ways, such as via preemption. The influential languages Simula and Smalltalk were among the first to introduce the concepts of messages and objects. By encapsulating state data, object-oriented programming reduced software complexity to interactions (messages) between objects.

Variations

There are a number of different language constructs under the class of goto statements.

Computed GOTO and Assigned GOTO

In Fortran, a computed GOTO jumps to one of several labels in a list, based on the value of an expression. An example is goto (20,30,40) i.[44] The equivalent construct in C is the switch statement, and in newer Fortran a SELECT CASE construct is the recommended syntactical alternative.[45] BASIC had a 'On GoTo' statement that achieved the same goal, but in Visual Basic this construct is no longer supported.[46] In versions prior to Fortran 95, Fortran also had an assigned goto variant that transfers control to a statement label (line number) which is stored in (assigned to) an integer variable. Jumping to an integer variable that had not been ASSIGNed to was unfortunately possible, and was a major source of bugs involving assigned gotos.[47] The Fortran assign statement only allows a constant (existing) line number to be assigned to the integer variable. However, some compilers allowed accidentally treating this variable as an integer thereafter, for example increment it, resulting in unspecified behavior at goto time. The following code demonstrates the behavior of the goto i when line i is unspecified:

assign 200 to i
i = i+1
goto i ! unspecified behavior
200 write(*,*) "this is valid line number"

Several C compilers implement two non-standard C/C++ extensions relating to gotos originally introduced by gcc.[48] The GNU extension allows the address of a label inside the current function to be obtained as a void* using the unary, prefix label value operator &&. The goto instruction is also extended to allow jumping to an arbitrary void* expression. This C extension is referred to as a computed goto in documentation of the C compilers that support it; its semantics are a superset of Fortran's assigned goto, because it allows arbitrary pointer expressions as the goto target, while Fortran's assigned goto doesn't allow arbitrary expressions as jump target.[49] As with the standard goto in C, the GNU C extension allows the target of the computed goto to reside only in the current function. Attempting to jump outside the current function results in unspecified behavior.[49] Some variants of BASIC also support a computed GOTO in the sense used in GNU C, i.e. in which the target can be any line number, not just one from a list. For example, in MTS BASIC one could write GOTO i*1000 to jump to the line numbered 1000 times the value of a variable i (which might represent a selected menu option, for example).[50] PL/I label variables achieve the effect of computed or assigned GOTOs.

ALTER

Up to the 1985 ANSI COBOL standard had the ALTER statement which could be used to change the destination of an existing GO TO, which had to be in a paragraph by itself.[51] The feature, which allowed polymorphism, was frequently condemned and seldom used.[52]

Perl GOTO

In Perl, there is a variant of the goto statement that is not a traditional GOTO statement at all. It takes a function name and transfers control by effectively substituting one function call for another (a tail call): the new function will not return to the GOTO, but instead to the place from which the original function was called.[53]

Emulated GOTO

There are several programming languages that do not support GOTO by default. By using GOTO emulation, it is still possible to use GOTO in these programming languages, albeit with some restrictions. One can emulate GOTO in Java,[54] JavaScript,[55] and Python.[56][57]

PL/I label variables

PL/I has the data type LABEL, which can be used to implement both the "assigned goto" and the "computed goto." PL/I allows branches out of the current block. A calling procedure can pass a label as an argument to a called procedure which can then exit with a branch. The value of a label variable includes the address of a stack frame, and a goto out of block pops the stack.

/* This implements the equivalent of */
/* the assigned goto                 */
declare where label;
where = somewhere;
goto where;
...
somewhere: /* statement */ ;
...
/* This implements the equivalent of */
/* the computed goto                 */
declare where (5) label;
declare inx fixed;
where(1) = abc;
where(2) = xyz;
...
goto where(inx);
...
abc: /* statement */ ;
...
xyz: /* statement */ ;
...

A simpler way to get an equivalent result is using a label constant array that doesn't even need an explicit declaration of a LABEL type variable:

/* This implements the equivalent of */
/* the computed goto                 */
declare inx fixed;
...
goto where(inx);
...
where(1): /* statement */ ;
...
where(2): /* statement */ ;
...

MS/DOS GOTO

In a DOS batch file, Goto directs execution to a label that begins with a colon. The target of the Goto can be a variable.

@echo off
SET D8str=%date%
SET D8dow=%D8str:~0,3%
FOR %%D in (Mon Wed Fri) do if "%%D" == "%D8dow%" goto SHOP%%D
echo Today, %D8dow%, is not a shopping day.
goto end
:SHOPMon
echo buy pizza for lunch - Monday is Pizza day.
goto end
:SHOPWed
echo buy Calzone to take home - today is Wednesday.
goto end
:SHOPFri
echo buy Seltzer in case somebody wants a zero calorie drink.
:end

Language support

Many languages support the goto statement, and many do not. In Java, goto is a reserved word, but is unusable, although compiled .class files generate GOTOs and LABELs.[58] Python does not have support for goto, although there are several joke modules that provide it.[56][57] There is no goto statement in Seed7 and hidden gotos like break- and continue-statements are also omitted.[59] In PHP there was no native support for goto until version 5.3 (libraries were available to emulate its functionality).[60] C# and Visual Basic .NET both support goto.[61][62] However, it does not allow jumping to a label outside of the current scope, and respects object disposal and finally constructs, making it significantly less powerful and dangerous than the goto keyword in other programming languages. It also makes case and default statements labels, whose scope is the enclosing switch statement; goto case or goto default is often used as an explicit replacement for implicit fallthrough, which C# disallows. The PL/I programing language has a GOTO statement that unwinds the stack for an out of block transfer and does not permit a transfer into a block from outside of it. Other languages may have their own separate keywords for explicit fallthroughs, which can be considered a version of goto restricted to this specific purpose. For example, Go uses the fallthrough keyword and doesn't allow implicit fallthrough at all,[63] while Perl 5 uses next for explicit fallthrough by default, but also allows setting implicit fallthrough as default behavior for a module. Most languages that have goto statements call it that, but in the early days of computing, other names were used. For example, in MAD the TRANSFER TO statement was used.[64] APL uses a right pointing arrow, for goto. C has goto, and it is commonly used in various idioms, as discussed above. Functional programming languages such as Scheme generally do not have goto, instead using continuations.

See also

Notes

  1. Watt & Findlay 2004.
  2. Kernighan & Ritchie 1988, p. 224, A9.6 Jump Statements.
  3. 3.0 3.1 Dijkstra 1968.
  4. GNU Pascal development team 2005, 5.1 Assorted Pascal Programming Tips.
  5. Louden & Lambert 2012.
  6. "The unbridled use of the goto statement has as an immediate consequence that it becomes terribly hard to find a meaningful set of coordinates in which to describe the process progress. ... The 'go to' statement as it stands is just too primitive, it is too much an invitation to make a mess of one's program."
  7. Rubin 1987.
  8. Dijkstra, Edsger W. On a Somewhat Disappointing Correspondence (EWD-1009) (PDF). E.W. Dijkstra Archive. Center for American History, University of Texas at Austin. (transcription) (May, 1987)
  9. Knuth 1974.
  10. Kernighan & Ritchie 1988, pp. 65–66, 3.8 Goto and Labels.
  11. 11.0 11.1 11.2 Vine 2007, p. 262.
  12. Geisler 2011.
  13. Prata 2013.
  14. 14.0 14.1 14.2 14.3 Sahni & Cmelik 1995.
  15. 15.0 15.1 15.2 Andrews 2003.
  16. McConnell 2004.
  17. Regehr 2013.
  18. Roberts 1995.
  19. Meyer 2009.
  20. Kozen & Tseng 2008.
  21. Stack Overflow Questions 2012.
  22. Pitchford & Tapp 2013.
  23. Williams 2013.
  24. ANSI X3.9-1978. American National Standard – Programming Language FORTRAN. American National Standards Institute. Also known as ISO 1539-1980, informally known as FORTRAN 77
  25. ISO/IEC 1539-1:1997. Information technology – Programming languages – Fortran – Part 1: Base language. Informally known as Fortran 95. There are a further two parts to this standard. Part 1 has been formally adopted by ANSI.
  26. Barnes 2006.
  27. Summit 1995.
  28. 28.0 28.1 Torvalds 2016.
  29. 29.0 29.1 Cozens 2004.
  30. Java Tutorial 2012.
  31. Gosling & McGilton 1996.
  32. Brender 2002, pp. 960–965.
  33. 33.0 33.1 Spinellis 2003.
  34. 34.0 34.1 Allain 2019.
  35. Stroustrup 2012.
  36. Hoad, Nathan (28 July 2022). "nathanhoad/godot_dialogue_manager". GitHub. Retrieved 3 February 2023.
  37. Chisnall 2012.
  38. Contieri 2021.
  39. Wilkes, Wheeler & Gill 1951.
  40. Ramshaw 1988.
  41. Siedersleben 2006.
  42. 42.0 42.1 Steele 1977.
  43. 43.0 43.1 Kelsey, Clinger & Rees 1998.
  44. , which means that the program jumps to label 20, 30 or 40, in case that i is either less than, equal to or greater than zero.
  45. Lahey Computer Systems, Inc 2004.
  46. Microsoft 2021.
  47. Wehr 1997.
  48. z/OS 2.5.0 in IBM Documentation 2021.
  49. 49.0 49.1 GCC, the GNU Compiler Collection 2021.
  50. Fronczak & Lubbers 1974, p. 226.
  51. The ALTER statement was deemed obsolete in the COBOL 1985 standard and deleted in 2002; see COBOL > Self-modifying code
  52. Van Tassel 2004.
  53. Perl syntax manual 2021.
  54. GOTO for Java 2009.
  55. Sexton 2012.
  56. 56.0 56.1 Hindle 2004.
  57. 57.0 57.1 Noack et al. 2015.
  58. Gosling et al. (2005) Unlike C and C++, the Java programming language has no goto statement; identifier statement labels are used with break (§14.15) or continue (§14.16) statements appearing anywhere within the labeled statement. The keywords const and goto are reserved, even though they are not currently used. This may allow a Java compiler to produce better error messages if these C++ keywords incorrectly appear in programs.
  59. Manual for the Seed7 programming language 2021.
  60. PHP Manual 2021.
  61. Wagner 2021.
  62. "GoTo Statement - Visual Basic | Microsoft Learn". Microsoft Learn. 15 September 2021. Retrieved 25 September 2023.
  63. The Go Programming Language Specification 2021.
  64. Galler 1962, pp. 26–28, 197, 211.

References