Preamble:
If you don't care about the details, the accepted answer suggesting DateTimeFormatter.ofPattern("yyyy MM dd") is fine. Otherwise, if you are interested in the tricky details of parsing, read on:
Regular expressions
As you have already recognized, complete validation is not possible with a regular expression like (19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01]). For example, this expression would accept "2017-02-31" (February with 31 days?).
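To make the limitation concrete, here is a minimal sketch that runs the expression from the question against two inputs. The class name RegexDateCheck and the helper matchesSyntax are made up for illustration only.

```java
import java.util.regex.Pattern;

class RegexDateCheck {
    // The expression from the question: year 1900-2099, month 01-12, day 01-31.
    static final Pattern DATE_PATTERN = Pattern.compile(
        "(19|20)\\d\\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])");

    // Hypothetical helper: true if the whole input matches the pattern.
    static boolean matchesSyntax(String input) {
        return DATE_PATTERN.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Matches the pattern although February 31 does not exist.
        System.out.println(matchesSyntax("2017-02-31")); // true
        // Month 13 is outside the range the pattern allows.
        System.out.println(matchesSyntax("2017-13-01")); // false
    }
}
```

The pattern can bound each field on its own, but it has no way to relate the day to the month or to leap years.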
Java-8-parsing mechanism
The Java-8 class DateTimeFormatter, however, can reject such non-existent dates just by parsing. To go into the details, we have to differentiate between syntactic validation and calendrical validation. The first kind, syntactic validation, is performed by the method parseUnresolved(). Its javadoc says:
Parsing is implemented as a two-phase operation. First, the text is
parsed using the layout defined by the formatter, producing a Map of
field to value, a ZoneId and a Chronology. Second, the parsed data is
resolved, by validating, combining and simplifying the various fields
into more useful ones. This method performs the parsing stage but not
the resolving stage.
The main advantage of this method is that it does not use exception flow, which makes this kind of parsing fast. However, the second step of parsing does use exception flow; see also the javadoc of the method parse(CharSequence, ParsePosition):
By contrast, this method will throw a DateTimeParseException if an
error occurs, with the exception containing the error index. This
change in behavior is necessary due to the increased complexity of
parsing and resolving dates/times in this API.
IMHO this is a limitation performance-wise. Another drawback is that the currently available API does not allow specifying a dot OR a hyphen as you have done in your regular expression. The API only offers a construct like "[.][-]" (using optional sections), but the problem is that an input sequence of ".-" would also be accepted by Java-8.
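The following sketch illustrates both points: parseUnresolved() reports failure through the ParsePosition instead of throwing, and the optional-section pattern accepts the sequence ".-". The class name SyntaxOnlyCheck and the helper matchesLayout are hypothetical. The pattern uses "uuuu" for the year, which is my choice here; for this unresolved step the year letter should not matter.

```java
import java.text.ParsePosition;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;

class SyntaxOnlyCheck {
    static final DateTimeFormatter DTF =
        DateTimeFormatter.ofPattern("uuuu[.][-]MM[.][-]dd");

    // Hypothetical helper: true if the whole input fits the layout.
    // No calendrical check happens here because nothing is resolved.
    static boolean matchesLayout(String input) {
        ParsePosition pos = new ParsePosition(0);
        TemporalAccessor ta = DTF.parseUnresolved(input, pos);
        return ta != null
            && pos.getErrorIndex() == -1
            && pos.getIndex() == input.length();
    }

    public static void main(String[] args) {
        System.out.println(matchesLayout("2017-02-31"));  // true: the layout fits, the date is not checked
        System.out.println(matchesLayout("2017.-02-31")); // true: both optional separators are present
        System.out.println(matchesLayout("2017/02/31"));  // false: reported via the error index, no exception
    }
}
```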
Well, these minor disadvantages are mentioned here for completeness. A final, almost perfect solution in Java-8 would be:
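import java.text.ParsePosition;
import java.time.DateTimeException;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.ResolverStyle;
import java.time.temporal.TemporalAccessor;

String input = "2017-02.-31";
// "uuuu" instead of "yyyy": in strict mode a year-of-era (y) cannot be resolved without an era
DateTimeFormatter dtf =
    DateTimeFormatter.ofPattern("uuuu[.][-]MM[.][-]dd").withResolverStyle(
        ResolverStyle.STRICT // smart mode truncates to Feb 28!
    );
ParsePosition pos = new ParsePosition(0);
TemporalAccessor ta = dtf.parseUnresolved(input, pos); // step 1: syntactic check only
LocalDate date = null;
if (pos.getErrorIndex() == -1 && pos.getIndex() == input.length()) {
    try {
        date = LocalDate.parse(input, dtf); // step 2: parse again and resolve
    } catch (DateTimeException dte) {
        dte.printStackTrace(); // thrown in strict mode (see resolver style above)
    }
}
System.out.println(date); // null in strict mode, 2017-02-28 in smart mode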
Important:
- The most thorough validation is only possible with the strict resolver style.
- The validation proposed also includes a check for trailing unparsed characters.
- The result ta of the method parseUnresolved() in step 1 cannot be used as an intermediate result due to internal limitations of resolving. So this two-step approach is also not ideal for performance. I have not benchmarked it against a normal one-step approach but hope that the main author of the new API (S. Colebourne) might have done so; see also, for comparison, his solution in his own Threeten-Extra library. It is more or less a hackish workaround to avoid exception flow as much as possible.
- For Java 6 and 7, a backport is available.
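For comparison, a normal one-step approach might look like the sketch below. It relies on exception flow for every invalid input and also shows the difference between the strict and smart resolver styles. The class name OneStepValidation and the helper tryParse are hypothetical, and the pattern uses "uuuu" because a year-of-era ("yyyy") cannot be resolved without an era in strict mode.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.Optional;

class OneStepValidation {
    static final DateTimeFormatter BASE =
        DateTimeFormatter.ofPattern("uuuu[.][-]MM[.][-]dd");

    // Hypothetical helper: parses and resolves in one step and maps
    // any parse or resolve failure to an empty Optional.
    static Optional<LocalDate> tryParse(String input, ResolverStyle style) {
        try {
            return Optional.of(LocalDate.parse(input, BASE.withResolverStyle(style)));
        } catch (DateTimeParseException ex) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("2017-02-31", ResolverStyle.STRICT));  // Optional.empty
        System.out.println(tryParse("2017-02-31", ResolverStyle.SMART));   // Optional[2017-02-28]
        System.out.println(tryParse("2017-02-28x", ResolverStyle.STRICT)); // Optional.empty (trailing character)
    }
}
```

LocalDate.parse already rejects trailing unparsed text, so the one-step variant needs no separate position check; the price is an exception for each invalid input.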
Alternative
If you are looking for an alternative other than SimpleDateFormat, you might also find my library Time4J interesting. It supports real OR-logic and avoids exception flow as much as possible (highly tuned parsing in only one step). Example:
String input = "2017-02-31";
ParseLog plog = new ParseLog();
PlainDate date =
    ChronoFormatter.ofDatePattern(
        "uuuu-MM-dd|uuuu.MM.dd", PatternType.CLDR, Locale.ROOT)
    .parse(input, plog); // uses smart mode by default and rejects Feb 31 in this mode
if (plog.isError()) {
    System.out.println(plog.getErrorMessage());
} else {
    System.out.println(date);
}
Notes:
- A check for trailing characters can be included in the same way as in Java-8.
- The parsed result is easily convertible to LocalDate via date.toTemporalAccessor().
- Using the format attribute Attributes.LENIENCY would weaken the validation.
- Time4J is also available for Java 6 and 7 (when using version line v3.x).