Thread: A programming languag with one token

  1. #1
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396

    A programming languag with one token

    We've been seeing a lot of riddles lately. Somebody mentioned constructing a computer which has only a single instruction. This is a somewhat related problem.

    Construct a programming language that uses only a single token. For consistency, let's use '@' as the token. There is no whitespace -- nothing except '@'. The language you construct must be Turing-complete. Implement an interpreter for this language.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  2. #2
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Clarify the restriction, please. A program is a single string consisting exclusively of a variable number of '@'?
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  3. #3
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by CornedBee View Post
    Clarify the restriction, please. A program is a single string consisting exclusively of a variable number of '@'?
    Correct. The restriction on no whitespace is in order to prevent:

    "@ @@@ @@ @@@@ @@ @@"

    I.e., using independent runs of '@' to stand in for more meaningful tokens.

    This problem IS solvable. In addition, programs are allowed to be of any length -- in fact, most useful programs will probably be extremely long. You don't have to prove your interpreter works by direct example, but you have to provide a clear explanation of how and why it works.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  4. #4
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Doesn't that mean that you could run-length compress any program into a single integer?

    So basically, every program consists of a single integer?

    Then I write this integer in base-4 representation and implement the navigation/modification part of BrainF, using the individual digits as the four operations (left, right, inc, dec). Voila, turing-complete language.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  5. #5
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by CornedBee View Post
    Doesn't that mean that you could run-length compress any program into a single integer?

    So basically, every program consists of a single integer?

    Then I write this integer in base-4 representation and implement the navigation/modification part of BrainF, using the individual digits as the four operations (left, right, inc, dec). Voila, turing-complete language.
    Okay, I guess it was too easy.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  6. #6
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    once I made a programming language. Written an interpreter and stuff. I made the language to be quit hard.

    Every line was a set of tokens (tokens may consist of anything but tabs and newlines), separated by tabs. The meaning of the line was defined by the number of tabs before the first token:
    - 0 means a function definition, the first token being the function name, the ones that followed the parameters
    - 1 means a function body: an instruction, the first token being the function name to call and the rest the arguments to call it with.
    - 2 means an include file. The token is a filename that will be inserted similar to C's #include.

    Now, there's one trick: every variable was a list of either an empty or a list of other lists. I called the language LiMa (List Mania, and it happened to be the name of a city I drove through on my road-trip through the US). But I never published the language.
    There are no integers. No strings. Everything lists. But completely turing complete.

    A few special functions:
    ">", with two tokens A and B, would insert A into the list B.
    "<", with two tokens A and B, would get the last inserted element from B into A, or make A an empty list B was an empty list.
    "|", with one token A, would display a single character, namely the one with as ASCII value being the number of items in the list A.
    "||", with one token A, would input a single character and make A a list with the number of items being the ASCII value of the character.

    Some more rules:
    - An unused variable starts as an empty list.
    - A line of a function body may start with a "?", followed by a tab, followed by an identifier A. The function of the rest of the line would only be executed if the list A was not empty.
    - The function main is called upon start of the program.
    - The arguments of a function will first be copied, and then copied back from left to right. So not really called by reference, but call by copy-back.


    I studied this language for way too long, and found some tricks I never imagined. I even wrote a fully working Dijkstra implementation once. True, it was really slow. But it worked. And it looks great.



    Code:
    #
    // This comment is actually a function name, way too long. We could theoretically call it.
    // But we won't. That's what kind of makes this a comment.
    // In functions, we make comments by simply calling a function that does nothing. I declared the function # for that.
    // We can use as many parameters as we want, they're simply not used. And the variables will be just texts, being identifiers for empty lists.
    
    add2	a
    	>	*	a
    	>	IAmEmpty	a
    
    add4	a
    	add2	a
    	add2	a
    
    add8	a
    	add4	a
    	add4	a
    
    add16	a
    	add8	a
    	add8	a
    
    add32	a
    	add16	a
    	add16	a
    
    add64	a
    	add32	a
    	add32	a
    
    A	a
    	#	Clean the list, then add 64, then add 1
    	#	We clean the list because * is an empty list. So a becomes empty as well.
    	#	Then we push an empty list to a.
    	<	a	*
    	add64	a
    	>	*	a
    
    main
    	#	First, write 'A' to the screen
    	A	a
    	|	a
    
    	#	Then B
    	>	*	a
    	|	a
    
    	#	Then a newline
    	add8	!
    	add2	!
    	|	!
    
    	#	Just an example of how to use the "?" functionality. It will never actually run, but still:
    	?	EmptyList	main

    Imagine a dijkstra in such a language.
    Imagine a program that inputs an integer and displays it in base 10. Several hundreds of lines of code.

    Now yet another puzzle: Make up a method to store integers (there are two proper ones I know of; one being a lot easier to implement but way slower), and write an addition function for it.
    Of if you don't have much time on your hands: make up something simple. Like a NOT function.

  7. #7
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,195
    Simple, the number of @'s represent teh integer that woudl be formed by considering the entire memory space of your program as a single number. Of course any program longer than 32 bytes is going to have problems finding enough matter to represent it.

  8. #8
    The larch
    Join Date
    May 2006
    Posts
    3,573
    Quote Originally Posted by CornedBee View Post
    Doesn't that mean that you could run-length compress any program into a single integer?

    So basically, every program consists of a single integer?

    Then I write this integer in base-4 representation and implement the navigation/modification part of BrainF, using the individual digits as the four operations (left, right, inc, dec). Voila, turing-complete language.
    I don't get it. How would a "hello world" look like? If I'm not mistaken Brain........ can use 30000 bytes of data for storage and the code can be arbitrarily long, your's would have only 4 or 8 bytes for both data and code?
    I might be wrong.

    Thank you, anon. You sure know how to recognize different types of trees from quite a long way away.
    Quoted more than 1000 times (I hope).

  9. #9
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Huh? I have all the storage of BrainF, and all the code length I need, provided I can find enough storage for that many '@'s.

    Let's say I define the following mapping of the 8 BrainF operations:
    0 = +
    1 = -
    2 = <
    3 = >
    4 = .
    5 = ,
    6 = [
    7 = ]

    The Hello World starts like this:
    Code:
    ++++++++++[
    So to rewrite that in my language, I first take this and remap it (I reverse the order, too).
    Code:
    60000000000
    A program cannot end with a +, but that's OK.
    Now 60000000000 in octal is 6442450944 in decimal, so I just need to write down 6 billion '@'s. Thankfully the language compresses very well, so I can actually do that in reasonable space.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  10. #10
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    IMHO the important thing to realize with this challenge is that the set of all programs can be completely ordered by encoding the program as an integer -- such an ordering is necessary to prove the Halting Theorem
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  11. #11
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Let's make it easy and call this token a 'bit'. We could group these bits as 'bytes'. We could then make something to read these 'bytes', some form of 'byte code reader'.

    We could then save ourselves a lot of time if we just installed Java.
    Hope is the first step on the road to disappointment.

  12. #12
    Complete Beginner
    Join Date
    Feb 2009
    Posts
    312
    Quote Originally Posted by brewbuck View Post
    Somebody mentioned constructing a computer which has only a single instruction.
    That was me.

    IMHO the important thing to realize with this challenge is that the set of all programs can be completely ordered by encoding the program as an integer
    Programs are already encoded as integers on our computers (though the mapping is neither injective nor surjective). But there's another interesting observation: there are obviously infinitely many programs and there's a mapping into the natural numbers, hence the number of programs is countably infinite. On the other hand, problem solutions can
    be represented as a function from the strings into [yes, no], where each string is a question/problem which can be answered by "yes" or "no". The cardinality of this set is |[yes, no]|^|number of finite strings|, which is uncountably infinite (as 2^|X| is the cardinality of the power set of X, and the power set of X is always strictly larger than X).



    Hence, there are infinitely many more problem solutions than there are programs.



    By the way, the traditional approach of encoding arbitrary expressions (e.g. programs, theorems, love letters) has originally been invented by Kurt Gödel in my beloved proof of the Incompleteness Theorem: assign a unique number (ID) to every token, then encode an expression as a product of primes by substituting each token with consecutive prime numbers and use the multiplicity to denote the actual token.

    Code:
    Consider the following statement: %$?%
    Assign IDs: %=1, $=2, ?=3.
    Encode: 2^1 * 3^2 * 5^3 * 7^1 = 15750
    Obviously, one can decode 15750 by applying prime factorization, ordering the prime factors and map their multiplicity back to the original tokens.

    Greets,
    Philip
    All things begin as source code.
    Source code begins with an empty file.
    -- Tao Te Chip

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Connecting to a mysql server and querying problem
    By Diod in forum C++ Programming
    Replies: 8
    Last Post: 02-13-2006, 10:33 AM
  2. Post...
    By maxorator in forum C++ Programming
    Replies: 12
    Last Post: 10-11-2005, 08:39 AM
  3. Dikumud
    By maxorator in forum C++ Programming
    Replies: 1
    Last Post: 10-01-2005, 06:39 AM
  4. Please Help - Problem with Compilers
    By toonlover in forum C++ Programming
    Replies: 5
    Last Post: 07-23-2005, 10:03 AM
  5. Parsing and Tokens (strtok)
    By readerwhiz in forum C Programming
    Replies: 6
    Last Post: 04-22-2002, 09:57 AM