Originally Posted by
Scriptonaut
I wasn't aware that read may not read it all at once, and I wasn't aware that it may return -1 without error. So basically what I need to do is the following?
Here's the actual code I'd use if I were you.
It takes the file descriptor, a pointer to a buffer pointer, a pointer to the buffer size, a pointer to the size of data already in buffer, and the size to reserve at the end of the buffer, as parameters. It reads everything from the descriptor (until end of input) into the buffer, dynamically growing it if necessary.
You can reuse a previous buffer, allocate an initial one, or set the pointed-to values to NULL, 0, 0. Remember to free() the buffer when you no longer need it. See the example at the end of this message.
The function ignores signal delivery interrupts, but will return an error if the descriptor is nonblocking and no data is immediately available. (File descriptors are blocking unless you explicitly open them as nonblocking or set them nonblocking afterwards.)
The function interface is such that if you deem an error unimportant, you can simply call the function again to read the rest of the input. You can also read data sequentially from multiple sources into one buffer, or pre-fill the buffer with your own data.
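To make the nonblocking case concrete, here is a minimal sketch of what read() itself does on an empty nonblocking pipe. The helper names set_nonblocking() and empty_nonblocking_read_errno() are made up for this illustration; only pipe(), fcntl(), read() and close() are real calls.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch a descriptor to nonblocking mode.
 * Returns 0 on success, -1 with errno set otherwise. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) ? -1 : 0;
}

/* Hypothetical demo: read from an empty nonblocking pipe.
 * Returns the errno value of the failed read(), 0 if the read
 * unexpectedly succeeded, or -1 if the setup failed. */
int empty_nonblocking_read_errno(void)
{
    int     fds[2];
    char    buf[16];
    ssize_t n;
    int     result;

    if (pipe(fds) == -1)
        return -1;
    if (set_nonblocking(fds[0]) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Nothing has been written and the write end is still open,
     * so this is "no data yet", not end of input. */
    n = read(fds[0], buf, sizeof buf);
    result = (n == (ssize_t)-1) ? errno : 0;   /* save errno before close() */

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

The value returned by the demo is the error you would see passed through by a read-everything function if you handed it such a descriptor.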
I would not normally show such complete code, but this is so often implemented only partially, or downright wrong, that I feel it should be shown completely.
Code:
#include <unistd.h>
#include <stdlib.h>
#include <errno.h>

/* If there are fewer than READ_MIN bytes available for
 * incoming data in the buffer, the buffer is reallocated. */
#define READ_MIN 4096

/* When reallocating, the buffer is resized
 * for this amount of incoming data. */
#define READ_MAX 131072

/* Read everything from a descriptor into a dynamically allocated buffer.
 * The function will return zero on success, errno otherwise.
 *   descriptor: File descriptor to read from.
 *   dataptr:    Pointer to the buffer pointer.
 *   sizeptr:    Pointer to the allocated size of the buffer.
 *   usedptr:    Pointer to the number of chars in the buffer.
 *   reserve:    Number of chars to reserve after the data in the buffer.
 * The buffer must be either dynamically allocated,
 * or initialized to NULL,0,0.
 */
int read_all(int const descriptor,
             char **const dataptr,
             size_t *const sizeptr,
             size_t *const usedptr,
             size_t const reserve)
{
    if (descriptor != -1 && dataptr && sizeptr && usedptr) {
        char    *data = *dataptr;
        size_t   size = *sizeptr;
        size_t   used = *usedptr;
        ssize_t  n;

        while (1) {

            /* Need to reallocate the buffer? */
            if (used + READ_MIN + reserve > size) {
                size = used + READ_MAX + reserve;
                data = realloc(data, size);
                if (!data)
                    return errno = ENOMEM; /* Caller still owns the old buffer. */

                /* Update data and size for the caller. */
                *dataptr = data;
                *sizeptr = size;
            }

            /* Read more data, retrying if interrupted by a signal. */
            do {
                n = read(descriptor, data + used, size - used - reserve);
            } while (n == (ssize_t)-1 && errno == EINTR);

            /* Error? If so, errno is already set. */
            if (n == (ssize_t)-1)
                return errno;

            /* Rare I/O error? */
            if (n < (ssize_t)-1)
                return errno = EIO;

            /* End of input? */
            if (n == (ssize_t)0)
                return errno = 0;

            /* We have n more chars read. */
            used += n;
            *usedptr = used;
        }

    } else {
        /* descriptor is -1, or one of the pointers NULL. */
        return errno = EINVAL;
    }
}
Here is an example main() to explore how the function works. I'll even throw in an example trim() that converts the data into a string (adding a '\0' at the end), replacing each run of ASCII control characters and whitespace with a single space, and trimming out leading and trailing control characters and whitespace.
Code:
#include <stdio.h>
#include <stdlib.h>   /* free() */
#include <string.h>
#include <errno.h>    /* errno */
#include <unistd.h>   /* STDIN_FILENO */
#include "read.h"     /* declares read_all() shown above */

void trim(char *const data, size_t len)
{
    size_t i = 0;
    size_t o = 0;

    while (i < len)
        if (data[i] >= 0 && data[i] <= 32) {

            /* data[i] is an ASCII whitespace or control character. Skip. */
            while (i < len && data[i] >= 0 && data[i] <= 32)
                i++;

            /* Add separator, but only between tokens. */
            if (i < len && o > 0)
                data[o++] = ' ';

        } else
            data[o++] = data[i++];

    /* Note: o may be len, so this may be data[len] = '\0'. */
    data[o] = '\0';
}

int main(void)
{
    char   *data = NULL;
    size_t  size = 0;
    size_t  len = 0;

    if (read_all(STDIN_FILENO, &data, &size, &len, 1)) {
        fprintf(stderr, "Error reading from standard input: %s.\n", strerror(errno));
        fflush(stderr);
        /* Do not abort, though. */
    }

    if (len > 0) {
        /* Trim whitespace from the input data, and append '\0'.
         * The final '\0' is why we reserved 1 char in read_all(). */
        trim(data, len);
    }

    if (len > 0) {
        printf("Read %lu chars of input, into a %lu char buffer.\n",
               (unsigned long)len, (unsigned long)size);
        printf("Trimmed down to '%s'.\n", data);
    } else
        printf("No input.\n");

    free(data);
    data = NULL;
    size = 0;
    len = 0;

    return 0;
}
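As mentioned above, the same buffer can be pre-filled and then extended from several descriptors in turn. The following is only a sketch of that usage: read_all_sketch() is a compact stand-in with the same interface as read_all() (so that this block compiles on its own), and CHUNK, pipe_with() and gather() are names invented for the illustration.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK 4096   /* assumed chunk size for this sketch */

/* Compact stand-in with the same interface as read_all(). */
int read_all_sketch(int fd, char **dataptr, size_t *sizeptr,
                    size_t *usedptr, size_t reserve)
{
    if (fd == -1 || !dataptr || !sizeptr || !usedptr)
        return errno = EINVAL;
    for (;;) {
        ssize_t n;
        if (*usedptr + CHUNK + reserve > *sizeptr) {
            size_t size = *usedptr + 2 * CHUNK + reserve;
            char  *data = realloc(*dataptr, size);
            if (!data)
                return errno = ENOMEM;
            *dataptr = data;
            *sizeptr = size;
        }
        do {
            n = read(fd, *dataptr + *usedptr, *sizeptr - *usedptr - reserve);
        } while (n == (ssize_t)-1 && errno == EINTR);
        if (n == (ssize_t)-1)
            return errno;
        if (n == 0)
            return errno = 0;
        *usedptr += (size_t)n;
    }
}

/* Hypothetical helper: returns the read end of a pipe that yields
 * text and then end of input. Only suitable for short texts that
 * fit in the pipe buffer. Returns -1 on failure. */
int pipe_with(const char *text)
{
    int    fds[2];
    size_t len = strlen(text);
    if (pipe(fds) == -1)
        return -1;
    if (write(fds[1], text, len) != (ssize_t)len) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);
    return fds[0];
}

/* Pre-fill the buffer, then append two sources to it.
 * Returns a string the caller must free(), or NULL on failure. */
char *gather(void)
{
    size_t size = 16, used = 0;
    char  *data = malloc(size);
    int    a = pipe_with("first ");
    int    b = pipe_with("second");
    int    failed = !data || a == -1 || b == -1;

    if (!failed) {
        memcpy(data, "own: ", 5);          /* our own pre-filled data */
        used = 5;
        failed = read_all_sketch(a, &data, &size, &used, 1) != 0
              || read_all_sketch(b, &data, &size, &used, 1) != 0;
    }
    if (a != -1) close(a);
    if (b != -1) close(b);
    if (failed) {
        free(data);
        return NULL;
    }
    data[used] = '\0';                     /* uses the reserved char */
    return data;
}
```

Because the used count is carried across calls, each call simply appends after whatever is already in the buffer.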
Questions?
Originally Posted by
Scriptonaut
I'm not using malloc
Then you're doing it wrong.
I hope you get your program working. I can imagine the contortions you have to do to make it work with just local variables (on the stack).
Originally Posted by
Scriptonaut
Alright, I'm starting to get this now. How exactly does read know when to stop reading?
Whenever read() returns zero, it means there is no more data to read. Either you are at the end of the file, or the other end of the pipe or socket closed the connection.
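A small sketch of that behaviour, using a pipe whose write end is closed after writing. The function name count_until_eof() is made up for this example, and for brevity it does not retry on EINTR.

```c
#include <unistd.h>

/* Hypothetical demo: write len bytes of msg into a pipe, close the
 * write end, then keep calling read() until it returns 0.
 * Returns the total number of bytes read, or -1 on error.
 * Only suitable for short messages that fit in the pipe buffer. */
long count_until_eof(const char *msg, size_t len)
{
    int     fds[2];
    char    buf[8];      /* deliberately small, to force several reads */
    long    total = 0;
    ssize_t n;

    if (pipe(fds) == -1)
        return -1;
    if (write(fds[1], msg, len) != (ssize_t)len) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* The writer is done. Once the buffered data has been consumed,
     * read() on the other end returns 0. */
    close(fds[1]);

    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += n;

    close(fds[0]);
    return (n == 0) ? total : -1;
}
```

If the write end were left open instead, the final read() would block waiting for more data rather than return 0.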
(On the command line, pressing Ctrl+D at the start of a line causes that to happen; it does not actually close the connection. It only tells the process reading the input that you do not intend to provide any more input.)
Originally Posted by
Scriptonaut
Ya, I am taking a single $(cmd ...) found in a string, and then expanding it. Since the entire cmd expansion system is recursive, it automatically separates it at whitespaces(and condenses whitespace to a single space)
In that case, you could use the data read by read_all(), remove any embedded '\0', then append an end-of-string '\0' to the data, just like my trim() function does in the example. That converts the data into a string you can supply to your command expansion system.
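If you only want to drop the embedded '\0' chars and leave whitespace alone, a minimal sketch could look like the following. strip_nuls() is a hypothetical name, and it assumes the buffer has room for at least len + 1 chars (for example because 1 char was reserved when reading).

```c
#include <stddef.h>

/* Hypothetical helper: remove every '\0' from data[0..len-1] in place,
 * then terminate the result with a single '\0'.
 * The buffer must have room for at least len + 1 chars.
 * Returns the length of the resulting string. */
size_t strip_nuls(char *data, size_t len)
{
    size_t i;
    size_t o = 0;

    for (i = 0; i < len; i++)
        if (data[i] != '\0')
            data[o++] = data[i];

    /* o <= len, so this is within the buffer. */
    data[o] = '\0';
    return o;
}
```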
Originally Posted by
Scriptonaut
I'm having trouble following this part. Read will automatically separate the buffer into tokens for me?
No, I meant you will. read() does not do anything to the data.